Skip to content

regexp: support lookaheads and lookbehinds #18868

@JulienChampseix

Description

@JulienChampseix

What version of Go are you using (go version)?

Go version 1.7.4

What operating system and processor architecture are you using (go env)?

Linux (Ubuntu and Debian)

What did you do?

We are using telegraf software and the logparser of this tools looks to be impacted by the non support of regex lookaheads or lookbehinds on this version of GO.

You can find more details here :

influxdata/telegraf#2178
influxdata/telegraf#2302
influxdata/telegraf#2343

Do you know if there is already a workaround this request ?

Activity

mvdan

mvdan commented on Jan 31, 2017

@mvdan
Member

There have been discussions on this before, see https://groups.google.com/forum/#!topic/golang-nuts/7qgSDWPIh_E.

The consensus from @rsc and @ianlancetaylor there is that it hasn't been done in an efficient way yet. I don't know if there have been any other developments.

changed the title [-]Support - Negative lookaheads[/-] [+]regexp: support lookaheads and lookbehinds[/+] on Jan 31, 2017
added
FeatureRequestIssues asking for a new feature that does not need a proposal.
on Jan 31, 2017
added this to the Unplanned milestone on Jan 31, 2017
smasher164

smasher164 commented on Nov 15, 2017

@smasher164
Member

One possible idea which I'm not sure has been suggested yet is to have multiple read-heads over the input. It is well known that a finite automata with two read-heads has more language-recognition power than a normal FA (for instance -- matching the language 0^n 1^n). In this case, if we wanted to stream the input, we assume the read-heads can only move left-to-right. Also, by increasing the number of read-heads linearly with the number of lookaround assertions, we can handle nested lookarounds.

In order to process a lookbehind assertion, we start stepping an additional read-head from the beginning of the input, and if the assertion is accepted by the time the two heads meet, we continue stepping the original read-head.

In order to process a lookahead assertion, we move the starting position of an additional read-head to match the position of the original. Then if the assertion is accepted, we switch back to the original head.

added
NeedsDecisionFeedback is required from experts, contributors, and/or the community before a change can be made.
on Nov 15, 2017
jonstewart

jonstewart commented on Nov 15, 2017

@jonstewart

I believe that lookaround assertions can be supported "efficiently" (i.e., with a one-time scan of the input) with an NFA. My colleague, Joel Uckelman, has done a substantial amount of work support lookaround assertions in lightgrep on an unfinished branch. I'm hoping we can pick it up again in the new year. The beauty of NFAs is that if you need to track more things, you just add more states. They aren't fast, though.

Regexp engines vary quite a bit in their support for types of lookaround assertions. Providing less than fully regular support may still be reasonable and useful.

rsc

rsc commented on Oct 2, 2018

@rsc
Contributor

Experience with RE2 has shown that it's OK - not ideal but OK - to live without these. I think it makes more sense for Go to continue to match RE2 than to try to strike out on its own here. And on top of that I still don't know how to implement them efficiently, despite thinking about this for over a decade.

locked and limited conversation to collaborators on Oct 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    ExpertNeededFeatureRequestIssues asking for a new feature that does not need a proposal.FrozenDueToAgeNeedsDecisionFeedback is required from experts, contributors, and/or the community before a change can be made.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @bradfitz@rsc@jonstewart@JulienChampseix@mvdan

        Issue actions

          regexp: support lookaheads and lookbehinds · Issue #18868 · golang/go