Finding all regex matches has always been O(n²).
Hacker News

13 min read Via iev.ee

Mewayz Team

Editorial Team

The Hidden Cost of Pattern Matching

For developers, regular expressions (regex) are an indispensable tool, a Swiss Army knife for searching, validating, and extracting text. From checking email formats to scraping data out of logs, regex is the go-to solution. Beneath this powerful surface, however, lies a performance trap that has plagued systems for decades: the worst-case time complexity of finding all matches in a string is O(n²). Quadratic complexity means that when the input string doubles in length, processing time can roughly quadruple, and for some patterns backtracking makes the growth exponential. The result is unexpected slowdowns, resource exhaustion, and a class of problems known as ReDoS (Regular Expression Denial of Service). Understanding this inherent limitation is the first step toward building robust, efficient applications.

Why Is Regex Matching O(n²)? The Backtracking Problem

The root of the problem lies in the mechanism most traditional regex engines use: backtracking. When a regex engine such as the ones in Perl, Python, or Java tries to find every possible match, it does not simply scan the string once. It explores multiple paths. Consider a simple pattern like `(a+)+b` applied to a string made up mostly of "a", such as "aaaaaaaaac". The engine greedily matches all the "a" characters with the inner `a+`, then tries to match the final "b". When that fails, it backtracks: it gives up the last "a" and tries the `+` quantifier on the outer group instead. This process repeats, forcing the engine to try every possible way of splitting the "a" characters into groups, which causes a combinatorial explosion. For a pattern like this one, the number of paths grows exponentially with the length of the run, which is far worse than quadratic. The O(n²) bound describes a milder but more pervasive case: the engine may scan up to n characters from each of up to n starting positions, so the total work can be proportional to the square of the string's length.
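The grouping behaviour described above can be modelled with a few lines of Python. This is a simplified sketch, not a trace of any real engine: the hypothetical helper `count_groupings` only counts the ways a run of n "a" characters can be split into consecutive non-empty groups, which is the search space a backtracking engine may have to exhaust for `(a+)+b` when the final "b" is missing.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_groupings(n: int) -> int:
    """Count the ways to split a run of n 'a' characters into consecutive
    non-empty groups, as the outer '+' in (a+)+ is free to do."""
    if n == 0:
        return 1  # nothing left to split: exactly one way to finish
    # The first group takes k characters; the remainder is split recursively.
    return sum(count_groupings(n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, count_groupings(n))
```

For n of at least 1 the count is 2^(n-1), so each extra "a" doubles the number of candidate splits in this model.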

  • Greedy Quantifiers: Patterns like `.*` or `.+` consume as much text as possible up front, which leads to heavy backtracking when a later part of the pattern fails to match.
  • Nested Quantifiers: Expressions like `(a+)+` or `(a*a*)*` create an exponential number of ways to split the input, which sharply increases processing time.
  • Ambiguous Patterns: When a string can be matched in several overlapping ways, the engine has to check every possibility in order to find all matches.
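The quadratic case is easiest to see with a greedy quantifier followed by something that never matches. The sketch below is a hand-written model of searching for `a*b`, not the internals of any real engine, and the function name `scan_steps` is made up for illustration. It counts how many character checks are made when the search restarts one position later after each failure.

```python
def scan_steps(text: str) -> int:
    """Count character checks made while looking for the pattern a*b,
    retrying from each start position after a failure (simplified model)."""
    steps = 0
    for start in range(len(text)):
        i = start
        # Greedy a*: consume as many 'a' characters as possible.
        while i < len(text) and text[i] == "a":
            i += 1
            steps += 1
        steps += 1  # one more check for the required 'b'
        if i < len(text) and text[i] == "b":
            return steps  # match found, stop searching
    return steps


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, scan_steps("a" * n))
```

On an input of n "a" characters with no "b", the model performs n(n+1)/2 + n checks, so doubling the input length roughly quadruples the work.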

Real-World Impact: More Than Just Talk

This is not merely an academic concern. An inefficient regex can have severe consequences in production. A seemingly harmless data validation check can become a bottleneck when processing large files or handling high volumes of user input. The most dangerous consequence is a ReDoS attack, in which a malicious actor supplies a carefully crafted string that triggers worst-case performance in a web application's regex, effectively hanging the server and making it unavailable to legitimate users. For a business, this translates directly into downtime, lost revenue, and reputational damage. When building complex systems, especially ones that process untrusted data, awareness of this regex trap is an essential part of security and performance auditing.
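One inexpensive mitigation for untrusted input is to bound its length before any regex runs, so that even a quadratic pattern has a hard ceiling on its cost. The following is a minimal sketch under stated assumptions: `MAX_INPUT_LENGTH`, `USERNAME_RE`, and `is_valid_username` are illustrative names, and the limit of 256 is an arbitrary placeholder rather than a recommendation.

```python
import re

MAX_INPUT_LENGTH = 256  # hypothetical limit; tune it for the field being validated

# Deliberately simple pattern with no nested quantifiers (illustrative only).
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")


def is_valid_username(value: str) -> bool:
    """Reject oversized input before the regex engine ever sees it."""
    if len(value) > MAX_INPUT_LENGTH:
        return False
    return USERNAME_RE.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_valid_username("alice_01"))
    print(is_valid_username("a" * 10_000))
```

A length cap does not fix a pathological pattern, but it limits how much damage a crafted input can do while the pattern itself is reviewed.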

"We once shipped a minor release that introduced a regex to parse user-agent strings. Under normal load it worked fine. But during a traffic spike it caused a cascading failure that took our API down for several minutes. The culprit was an O(n²) regex we never knew we had." - A Senior DevOps Engineer

Building Smarter Systems with Mewayz

So how do we move beyond this fundamental limitation? The answer combines better tooling with sound architectural practice. First, developers can use regex analyzers to identify problematic patterns and rewrite them to be more efficient (e.g., by using possessive quantifiers or atomic groups). For the best worst-case behaviour, there are alternative algorithms that guarantee linear time, O(n), for pattern matching, for example the automaton-based approach used by RE2, although they are not commonly found in standard libraries.
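As a small illustration of rewriting a problematic pattern, the sketch below replaces the nested quantifier in `(a+)+b` with the flat form `a+b`, which accepts the same strings but leaves only one way to match each run of "a" characters. The names `NESTED`, `FLAT`, `SAMPLES`, and `same_result` are made up for this example, and the sample inputs are kept short on purpose so the nested pattern stays cheap to run.

```python
import re

NESTED = re.compile(r"(a+)+b")  # nested quantifier: many ways to split a run of a's
FLAT = re.compile(r"a+b")       # same language, one way to match each run

# Short inputs only, so the nested pattern cannot blow up here.
SAMPLES = ["ab", "aaab", "b", "aac", "", "aaaab"]


def same_result(text: str) -> bool:
    """Check that both patterns agree on whether the whole text matches."""
    return (NESTED.fullmatch(text) is None) == (FLAT.fullmatch(text) is None)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), same_result(sample))
```

The flat form removes the exponential number of groupings. It does not by itself make a search over every start position linear, so a length limit or a linear-time engine may still be worth considering.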

This is where a modular business OS like Mewayz offers a significant advantage. Mewayz lets you compartmentalize and monitor critical components. Instead of a monolithic application in which one slow regex can paralyze the whole system, you can run data parsing and validation in a dedicated, isolated microservice. If a performance problem arises, it stays contained and can be fixed without affecting other business functions. In addition, the monitoring tools in the Mewayz platform can help you spot these inefficient processes before they affect your customers, turning a potential crisis into a manageable maintenance task. By building on a flexible, observable foundation, you help ensure that your business logic, including complex text processing, stays performant and resilient.
