Developing Rules

Rules in SQLFluff are implemented as crawlers. These are entities which work their way through the parsed structure of a query to evaluate a particular rule or set of rules. The intent is that the definition of each specific rule should be really streamlined and only contain the logic for the rule itself, with all the other mechanics abstracted away.

Base Rules

base_rules Module

Implements the base rule class.

Rules crawl through the trees returned by the parser and evaluate particular rules.

The intent is that it should be possible for the rules to be expressed as simply as possible, with as much of the complexity abstracted away.

The evaluation function should take enough arguments that it can evaluate the position of the given segment in relation to its neighbors. The segment which finally “triggers” the error should be the one that would be corrected. If the rule relates to something that is missing, it should instead flag the segment FOLLOWING the place where the desired element is missing.

class BaseRule(code, description, **kwargs)

The base class for a rule.

Parameters
  • code (str) – The identifier for this rule, used in inclusion or exclusion.

  • description (str) – A human readable description of what this rule does. It will be displayed when any violations are found.

crawl(segment, ignore_mask, dialect, parent_stack=None, siblings_pre=None, siblings_post=None, raw_stack=None, memory=None, fname=None, templated_file: Optional[sqlfluff.core.templaters.base.TemplatedFile] = None)

Recursively perform the crawl operation on a given segment.

Returns

A tuple of (vs, raw_stack, fixes, memory)

static discard_unsafe_fixes(lint_result: sqlfluff.core.rules.base.LintResult, templated_file: Optional[sqlfluff.core.templaters.base.TemplatedFile])

Remove (discard) LintResult fixes if they are “unsafe”.

By removing its fixes, a LintResult will still be reported, but it will be treated as unfixable.

static filter_meta(segments, keep_meta=False)

Filter the segments to non-meta.

Or optionally the opposite if keep_meta is True.
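The filtering behaviour described above can be sketched without SQLFluff installed. ToySegment and filter_meta_sketch are hypothetical names used only for illustration, and returning a tuple is an assumption rather than something this page states.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ToySegment:
    """Hypothetical stand-in for a parsed segment (not a SQLFluff class)."""

    raw: str
    is_meta: bool = False


def filter_meta_sketch(
    segments: Iterable[ToySegment], keep_meta: bool = False
) -> Tuple[ToySegment, ...]:
    """Keep non-meta segments, or only the meta ones when keep_meta is True."""
    # Assumption: the result is returned as a tuple.
    return tuple(seg for seg in segments if seg.is_meta is keep_meta)


segments = [ToySegment("SELECT"), ToySegment("", is_meta=True), ToySegment("a")]
non_meta = filter_meta_sketch(segments)
only_meta = filter_meta_sketch(segments, keep_meta=True)
```

Here non_meta holds the two ordinary segments and only_meta holds the single meta segment.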

classmethod get_parent_of(segment, root_segment)

Return the segment immediately containing segment.

NB: This is recursive.

Parameters
  • segment – The segment to look for.

  • root_segment – Some known parent of the segment we’re looking for (although likely not the direct parent in question).
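As an illustration of the recursive lookup, the sketch below walks a toy tree. ToyNode and get_parent_of_sketch are hypothetical, and the assumption that each node exposes its children as a segments list is made only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ToyNode:
    """Hypothetical tree node; children live in `segments` (an assumption)."""

    name: str
    segments: List["ToyNode"] = field(default_factory=list)


def get_parent_of_sketch(segment: ToyNode, root_segment: ToyNode) -> Optional[ToyNode]:
    """Return the node that directly contains `segment`, searching depth first."""
    for child in root_segment.segments:
        if child is segment:
            return root_segment
        # Recurse into the child in case the target is nested deeper.
        found = get_parent_of_sketch(segment, child)
        if found is not None:
            return found
    return None


column = ToyNode("column")
select_clause = ToyNode("select_clause", [column])
statement = ToyNode("statement", [select_clause])
root = ToyNode("file", [statement])
parent = get_parent_of_sketch(column, root)
```

Passing the file root as root_segment still finds select_clause as the direct parent, which is the situation the root_segment parameter description refers to.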

indent

String for a single indent, based on configuration.

is_final_segment(context: sqlfluff.core.rules.base.RuleContext) bool

Is the current segment the final segment in the parse tree?

static matches_target_tuples(seg: sqlfluff.core.parser.segments.base.BaseSegment, target_tuples: List[Tuple[str, str]])

Does the given segment match any of the given type tuples?

static split_comma_separated_string(raw_str: str) List[str]

Converts comma separated string to List, stripping whitespace.
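A stand-alone sketch of this conversion might look like the following. It is not the library's code, and dropping empty items (for example after a trailing comma) is an assumption.

```python
from typing import List


def split_comma_separated_string_sketch(raw_str: str) -> List[str]:
    """Split on commas and strip surrounding whitespace from each item."""
    # Assumption: empty items (e.g. from a trailing comma) are dropped.
    return [item.strip() for item in raw_str.split(",") if item.strip()]


rules = split_comma_separated_string_sketch(" L001 ,L002,  L003 ")
```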

class FunctionalRuleContext(context: sqlfluff.core.rules.base.RuleContext)

RuleContext written in a “functional” style; simplifies writing rules.

property parent_stack: sqlfluff.core.rules.functional.segments.Segments

Returns a Segments object for context.parent_stack.

raw_segments

Returns a Segments object for all the raw segments in the file.

raw_stack

Returns a Segments object for context.raw_stack.

segment

Returns a Segments object for context.segment.

property siblings_post: sqlfluff.core.rules.functional.segments.Segments

Returns a Segments object for context.siblings_post.

property siblings_pre: sqlfluff.core.rules.functional.segments.Segments

Returns a Segments object for context.siblings_pre.

class LintFix(edit_type: str, anchor: sqlfluff.core.parser.segments.base.BaseSegment, edit: Optional[Iterable[sqlfluff.core.parser.segments.base.BaseSegment]] = None, source: Optional[Iterable[sqlfluff.core.parser.segments.base.BaseSegment]] = None)

A class to hold a potential fix to a linting violation.

Parameters
  • edit_type (str) – One of create_before, create_after, replace, delete to indicate the kind of fix this represents.

  • anchor (BaseSegment) – A segment which represents the position that this fix should be applied at. For deletions it represents the segment to delete, for creations it implies the position to create at (with the existing element at this position to be moved after the edit), for a replace it implies the segment to be replaced.

  • edit (BaseSegment, optional) – For replace and create fixes, this holds the iterable of segments to create or replace at the given anchor point.

  • source (BaseSegment, optional) – For replace and create fixes, this holds the iterable of segments that provided the code. IMPORTANT: The linter uses this to prevent copying material from templated areas.
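To make the four edit types concrete, the sketch below applies them to a flat list of strings instead of real segments. apply_fix_sketch and the integer anchor_index are hypothetical simplifications; the real class anchors on a BaseSegment and is applied by the linter.

```python
from typing import List, Optional, Sequence


def apply_fix_sketch(
    tokens: Sequence[str],
    edit_type: str,
    anchor_index: int,
    edit: Optional[Sequence[str]] = None,
) -> List[str]:
    """Apply one toy fix to a list of tokens and return the new list."""
    before = list(tokens[:anchor_index])
    anchor = [tokens[anchor_index]]
    after = list(tokens[anchor_index + 1 :])
    new = list(edit or [])
    if edit_type == "create_before":
        # The existing element at the anchor ends up after the new material.
        return before + new + anchor + after
    if edit_type == "create_after":
        return before + anchor + new + after
    if edit_type == "replace":
        return before + new + after
    if edit_type == "delete":
        return before + after
    raise ValueError(f"Unknown edit_type: {edit_type!r}")


tokens = ["SELECT", " ", "a"]
replaced = apply_fix_sketch(tokens, "replace", 2, ["b"])
created = apply_fix_sketch(tokens, "create_before", 2, ["DISTINCT", " "])
deleted = apply_fix_sketch(tokens, "delete", 1)
```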

classmethod create_after(anchor_segment: sqlfluff.core.parser.segments.base.BaseSegment, edit_segments: Iterable[sqlfluff.core.parser.segments.base.BaseSegment], source: Optional[Iterable[sqlfluff.core.parser.segments.base.BaseSegment]] = None) sqlfluff.core.rules.base.LintFix

Create edit segments after the supplied anchor segment.

classmethod create_before(anchor_segment: sqlfluff.core.parser.segments.base.BaseSegment, edit_segments: Iterable[sqlfluff.core.parser.segments.base.BaseSegment], source: Optional[Iterable[sqlfluff.core.parser.segments.base.BaseSegment]] = None) sqlfluff.core.rules.base.LintFix

Create edit segments before the supplied anchor segment.

classmethod delete(anchor_segment: sqlfluff.core.parser.segments.base.BaseSegment) sqlfluff.core.rules.base.LintFix

Delete supplied anchor segment.

has_template_conflicts(templated_file: sqlfluff.core.templaters.base.TemplatedFile) bool

Does this fix conflict with (i.e. touch) templated code?

is_trivial()

Return true if the fix is trivial.

Trivial edits are:
  • Anything of zero length.

  • Any edits which result in themselves.

Removing these makes the routines which process fixes much faster.
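The two kinds of trivial edit can be sketched on plain strings. is_trivial_sketch is hypothetical; treating deletes as never trivial, and a create with no edit at all as trivial, are assumptions of this sketch rather than documented behaviour.

```python
from typing import Optional, Sequence


def is_trivial_sketch(
    edit_type: str, anchor: str, edit: Optional[Sequence[str]] = None
) -> bool:
    """Return True if the toy edit would leave the text unchanged."""
    if edit_type in ("create_before", "create_after"):
        # Creating nothing, or only zero-length text, changes nothing.
        return all(len(raw) == 0 for raw in (edit or []))
    if edit_type == "replace":
        # Replacing something with itself changes nothing.
        return list(edit or []) == [anchor]
    # Assumption: deletes always change the text, so they are never trivial.
    return False


zero_length = is_trivial_sketch("create_before", "a", [""])
self_replace = is_trivial_sketch("replace", "a", ["a"])
real_replace = is_trivial_sketch("replace", "a", ["b"])
```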

classmethod replace(anchor_segment: sqlfluff.core.parser.segments.base.BaseSegment, edit_segments: Iterable[sqlfluff.core.parser.segments.base.BaseSegment], source: Optional[Iterable[sqlfluff.core.parser.segments.base.BaseSegment]] = None) sqlfluff.core.rules.base.LintFix

Replace supplied anchor segment with the edit segments.

class LintResult(anchor: Optional[sqlfluff.core.parser.segments.base.BaseSegment] = None, fixes: Optional[List[sqlfluff.core.rules.base.LintFix]] = None, memory=None, description=None)

A class to hold the results of a rule evaluation.

Parameters
  • anchor (BaseSegment, optional) – A segment which represents the position of the problem. NB: Each fix will also hold its own reference to position, so this position is mostly for alerting the user to where the problem is.

  • fixes (list of LintFix, optional) – An array of any fixes which would correct this issue. If not present then it’s assumed that this issue will have to be fixed manually.

  • memory (dict, optional) – An object which stores any working memory for the rule. The memory returned in any LintResult will be passed as an input to the next segment to be crawled.

  • description (str, optional) – A description of the problem identified as part of this result. This will override the description of the rule as what gets reported to the user with the problem if provided.
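The way memory is threaded from one evaluation to the next can be sketched with a toy crawl loop. eval_sketch and crawl_sketch are hypothetical, and segments are reduced to plain strings; the real crawl is recursive and returns LintResult objects.

```python
from typing import Any, Dict, List, Optional, Tuple


def eval_sketch(
    segment: str, memory: Optional[Dict[str, Any]]
) -> Tuple[Optional[str], Dict[str, Any]]:
    """Hypothetical rule: flag any segment that has been seen before."""
    seen = set((memory or {}).get("seen", ()))
    violation = segment if segment in seen else None
    seen.add(segment)
    # The returned memory becomes the input for the next segment.
    return violation, {"seen": seen}


def crawl_sketch(segments: List[str]) -> List[str]:
    """Evaluate each segment in order, passing memory along."""
    memory: Optional[Dict[str, Any]] = None
    violations: List[str] = []
    for segment in segments:
        violation, memory = eval_sketch(segment, memory)
        if violation is not None:
            violations.append(violation)
    return violations


flagged = crawl_sketch(["a", "b", "a", "c", "b"])
```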

to_linting_error(rule) Optional[sqlfluff.core.errors.SQLLintError]

Convert a linting result to a SQLLintError if appropriate.

class RuleContext(segment: sqlfluff.core.parser.segments.base.BaseSegment, parent_stack: Tuple[sqlfluff.core.parser.segments.base.BaseSegment, ...], siblings_pre: Tuple[sqlfluff.core.parser.segments.base.BaseSegment, ...], siblings_post: Tuple[sqlfluff.core.parser.segments.base.BaseSegment, ...], raw_stack: Tuple[sqlfluff.core.parser.segments.raw.RawSegment, ...], memory: Any, dialect: sqlfluff.core.dialects.base.Dialect, path: Optional[pathlib.Path], templated_file: Optional[sqlfluff.core.templaters.base.TemplatedFile])

Class for holding the context passed to rule eval functions.

functional

Returns a Surrogates object that simplifies writing rules.

class RuleGhost(code, description)

property code

Alias for field number 0

property description

Alias for field number 1
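The “Alias for field number N” text is the default docstring Python gives to namedtuple fields, so RuleGhost appears to be a named tuple. A minimal sketch of an equivalent structure, using the hypothetical name RuleGhostSketch:

```python
from collections import namedtuple

# Hypothetical equivalent of a two-field named tuple.
RuleGhostSketch = namedtuple("RuleGhostSketch", ["code", "description"])

ghost = RuleGhostSketch("L001", "Description of rule.")

# Fields are reachable by name, by index, or by unpacking.
code, description = ghost
```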

class RuleLoggingAdapter(logger, extra)

A LoggingAdapter for rules which adds the code of the rule to it.

process(msg, kwargs)

Add the code element to the logging message before emit.
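A comparable adapter can be sketched with the standard library's logging.LoggerAdapter. CodeLoggingAdapterSketch is hypothetical, and both the "code" key in extra and the bracketed prefix format are assumptions.

```python
import logging
from typing import Any, MutableMapping, Tuple


class CodeLoggingAdapterSketch(logging.LoggerAdapter):
    """Hypothetical adapter that prefixes each message with a rule code."""

    def process(
        self, msg: Any, kwargs: MutableMapping[str, Any]
    ) -> Tuple[Any, MutableMapping[str, Any]]:
        # Assumption: the rule code is supplied under the "code" key of `extra`.
        return f"[{self.extra['code']}] {msg}", kwargs


adapter = CodeLoggingAdapterSketch(
    logging.getLogger("rules_sketch"), {"code": "L001"}
)
message, _ = adapter.process("Checking segment", {})
```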

class RuleSet(name, config_info)

Class to define a ruleset.

A rule set is instantiated on module load, but the references to each of its classes are instantiated at runtime. This means that configuration values can be passed to those rules live and be responsive to any changes in configuration from the path that the file is in.

Rules should be fetched using the get_rulelist() method, which also handles any filtering (i.e. allowlisting and denylisting).

New rules should be added to the instance of this class using the register() decorator. That decorator registers the class, but also performs basic type and name-convention checks.

The code for the rule will be parsed from the name, and the description from the docstring. The eval function is expected to be overridden by the subclass; the parent class raises an error from this function if it is not overridden.

copy()

Return a copy of self with a separate register.

get_rulelist(config) List[sqlfluff.core.rules.base.BaseRule]

Use the config to return the appropriate rules.

We use the config both for allowlisting and denylisting, and for configuring the rules themselves.

Returns

list of instantiated BaseRule.
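The allowlist and denylist filtering can be sketched on bare rule codes. select_rule_codes_sketch is hypothetical; it ignores how the real config expresses these lists and does not instantiate or configure any rules.

```python
from typing import Iterable, List, Optional


def select_rule_codes_sketch(
    registered: Iterable[str],
    allowlist: Optional[Iterable[str]] = None,
    denylist: Optional[Iterable[str]] = None,
) -> List[str]:
    """Keep codes that are allowed (if an allowlist is given) and not denied."""
    allowed = None if allowlist is None else set(allowlist)
    denied = set(denylist or ())
    return [
        code
        for code in registered
        if (allowed is None or code in allowed) and code not in denied
    ]


selected = select_rule_codes_sketch(
    ["L001", "L002", "L003"], allowlist=["L001", "L003"], denylist=["L003"]
)
```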

register(cls, plugin=None)

Decorate a class with this to add it to the ruleset.

@myruleset.register
class Rule_L001(BaseRule):
    "Description of rule."

    def eval(self, **kwargs):
        return LintResult()

We expect that rules are defined as classes with the name Rule_XXXX where XXXX is of the form LNNN, where L is a letter (literally L for linting by default) and NNN is a three digit number.

If this receives classes by any other name, then it will raise a ValueError.
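The registration mechanics can be sketched with a toy registry. ToyRuleSet is hypothetical and heavily simplified: it only checks the plain Rule_LNNN form, takes the code from the class name and the description from the docstring, and does not subclass or instantiate anything from SQLFluff.

```python
from typing import Dict, List, Tuple, Type


class ToyRuleSet:
    """Hypothetical, heavily simplified registry (not SQLFluff's RuleSet)."""

    def __init__(self) -> None:
        self._register: Dict[str, Tuple[Type, str]] = {}

    def register(self, cls: Type) -> Type:
        """Decorator: validate the class name, then store the class by code."""
        prefix, _, code = cls.__name__.rpartition("_")
        valid_code = len(code) == 4 and code[0].isupper() and code[1:].isdigit()
        if not prefix.startswith("Rule") or not valid_code:
            raise ValueError(f"Unexpected rule class name: {cls.__name__!r}")
        # Code comes from the name, description from the docstring.
        self._register[code] = (cls, (cls.__doc__ or "").strip())
        return cls

    def codes(self) -> List[str]:
        return sorted(self._register)


toy_ruleset = ToyRuleSet()


@toy_ruleset.register
class Rule_L001:
    """Description of rule."""
```

Decorating a class named, say, BadName with toy_ruleset.register raises ValueError in this sketch, mirroring the behaviour described above.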

property valid_rule_name_regex

Defines the accepted pattern for rule names.

The first group captures the plugin name (optional), which must be capitalized. The second group captures the rule code.

Examples of valid rule names:

  • Rule_PluginName_L001

  • Rule_L001
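A pattern with the two groups described above can be sketched as follows. RULE_NAME_PATTERN_SKETCH is an illustrative pattern written for this example and is not necessarily the expression the library uses.

```python
import re
from typing import Optional, Tuple

# Group 1: optional capitalized plugin name. Group 2: the rule code.
RULE_NAME_PATTERN_SKETCH = re.compile(
    r"^Rule_(?:([A-Z][a-zA-Z]+)_)?([A-Z][0-9]{3})$"
)


def parse_rule_name_sketch(name: str) -> Tuple[Optional[str], str]:
    """Return (plugin_name, code), raising ValueError for invalid names."""
    match = RULE_NAME_PATTERN_SKETCH.match(name)
    if match is None:
        raise ValueError(f"Invalid rule name: {name!r}")
    plugin_name, code = match.groups()
    return plugin_name, code


plain = parse_rule_name_sketch("Rule_L001")
plugin = parse_rule_name_sketch("Rule_PluginName_L001")
```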

Functional API

These newer modules provide a higher-level API for rules working with segments and slices. Rules that need to navigate or search the parse tree may benefit from using these. Eventually, the plan is for all rules to use these modules. As of December 30, 2021, 17+ rules use these modules.

The modules listed below are submodules of sqlfluff.core.rules.functional.

segments Module

Surrogate class for working with Segment collections.

class Segments(*segments, templated_file=None)

Encapsulates a sequence of one or more BaseSegments.

The segments may or may not be contiguous in a parse tree. Provides useful operations on a sequence of segments to simplify rule creation.

all(predicate: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None) bool

Do all the segments match?

any(predicate: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None) bool

Do any of the segments match?

apply(fn: Callable[[sqlfluff.core.parser.segments.base.BaseSegment], Any]) List[Any]

Apply function to every item.

children(predicate: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None) sqlfluff.core.rules.functional.segments.Segments

Returns an object with children of the segments in this object.

find(segment: Optional[sqlfluff.core.parser.segments.base.BaseSegment]) int

Returns index if found, -1 if not found.

first(predicate: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None) sqlfluff.core.rules.functional.segments.Segments

Returns the first segment (if any) that satisfies the predicates.

get(index: int = 0, *, default: Optional[Any] = None) Optional[sqlfluff.core.parser.segments.base.BaseSegment]

Return specified item. Returns default if index out of range.

last(predicate: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None) sqlfluff.core.rules.functional.segments.Segments

Returns the last segment (if any) that satisfies the predicates.

property raw_segments: sqlfluff.core.rules.functional.segments.Segments

Get raw segments underlying the segments.

property raw_slices: sqlfluff.core.rules.functional.raw_file_slices.RawFileSlices

Raw slices of the segments, sorted in source file order.

recursive_crawl(*seg_type: str, recurse_into: bool = True) sqlfluff.core.rules.functional.segments.Segments

Recursively crawl for segments of a given type.

reversed() sqlfluff.core.rules.functional.segments.Segments

Return the same segments in reverse order.

select(select_if: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None, loop_while: Optional[Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]] = None, start_seg: Optional[sqlfluff.core.parser.segments.base.BaseSegment] = None, stop_seg: Optional[sqlfluff.core.parser.segments.base.BaseSegment] = None) sqlfluff.core.rules.functional.segments.Segments

Retrieve range/subset.

NOTE: Iterates the segments BETWEEN start_seg and stop_seg, i.e. those segments are not included in the loop.
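The exclusive start and stop behaviour can be sketched on a plain list. select_sketch is hypothetical, works on strings rather than segments, and locates start and stop by equality, which is a simplification.

```python
from typing import Callable, List, Optional, Sequence


def select_sketch(
    items: Sequence[str],
    select_if: Optional[Callable[[str], bool]] = None,
    loop_while: Optional[Callable[[str], bool]] = None,
    start: Optional[str] = None,
    stop: Optional[str] = None,
) -> List[str]:
    """Select items strictly between start and stop."""
    start_index = items.index(start) if start is not None else -1
    stop_index = items.index(stop) if stop is not None else len(items)
    selected: List[str] = []
    for item in items[start_index + 1 : stop_index]:
        # loop_while ends the whole loop at the first failing item.
        if loop_while is not None and not loop_while(item):
            break
        # select_if only filters; the loop carries on afterwards.
        if select_if is None or select_if(item):
            selected.append(item)
    return selected


tokens = ["SELECT", " ", "a", ",", " ", "b", " ", "FROM", " ", "t"]
between = select_sketch(
    tokens, select_if=lambda t: t.strip() != "", start="SELECT", stop="FROM"
)
until_comma = select_sketch(tokens, loop_while=lambda t: t != ",")
```

Neither "SELECT" nor "FROM" appears in between, since the start and stop items themselves are excluded.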

segment_predicates Module

Defines commonly used segment predicates for rule writers.

For consistency, all the predicates in this module are implemented as functions returning functions. This avoids rule writers having to remember the distinction between normal functions and functions returning functions.

This is not necessarily a complete set of predicates covering all possible requirements. Rule authors can define their own predicates as needed, either as regular functions, lambda, etc.
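The “functions returning functions” convention, and the way a rule author might compose custom predicates, can be sketched as below. ToySegment and the *_sketch helpers are hypothetical stand-ins that mirror the shape of the predicates in this module without importing it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToySegment:
    """Hypothetical stand-in for a segment with a raw string and a type."""

    raw: str
    type: str


Predicate = Callable[[ToySegment], bool]


def is_type_sketch(*seg_type: str) -> Predicate:
    """Return a predicate that checks for any of the given types."""

    def _(segment: ToySegment) -> bool:
        return segment.type in seg_type

    return _


def not_sketch(fn: Predicate) -> Predicate:
    """Return a predicate that negates fn."""
    return lambda segment: not fn(segment)


def and_sketch(*functions: Predicate) -> Predicate:
    """Return a predicate that is true only if every function is true."""
    return lambda segment: all(fn(segment) for fn in functions)


# Every predicate is built by calling a factory, even ones with no arguments.
is_code_like = and_sketch(
    not_sketch(is_type_sketch("whitespace", "newline")),
    not_sketch(is_type_sketch("comment")),
)

segments = [
    ToySegment("SELECT", "keyword"),
    ToySegment(" ", "whitespace"),
    ToySegment("-- note", "comment"),
    ToySegment("a", "identifier"),
]
code_like = [seg.raw for seg in segments if is_code_like(seg)]
```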

and_(*functions: Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that computes the functions and-ed together.

get_name() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], str]

Returns a function that gets segment name.

is_code() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is code.

is_comment() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is comment.

is_expandable() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is expandable.

is_keyword(*keyword_name) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that determines if it’s a matching keyword.

is_meta() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is meta.

is_name(*seg_name: str) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that determines if segment is one of the names.

is_raw() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is raw.

is_type(*seg_type: str) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that determines if segment is one of the types.

is_whitespace() Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that checks if segment is whitespace.

not_(fn: Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that computes: not fn().

or_(*functions: Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]) Callable[[sqlfluff.core.parser.segments.base.BaseSegment], bool]

Returns a function that computes the functions or-ed together.

raw_slice(segment: sqlfluff.core.parser.segments.base.BaseSegment, raw_slice_: sqlfluff.core.templaters.base.RawFileSlice) str

Return the portion of a segment’s source provided by raw_slice.

raw_slices(segment: sqlfluff.core.parser.segments.base.BaseSegment, templated_file: Optional[sqlfluff.core.templaters.base.TemplatedFile]) sqlfluff.core.rules.functional.raw_file_slices.RawFileSlices

Returns raw slices for a segment.

templated_slices(segment: sqlfluff.core.parser.segments.base.BaseSegment, templated_file: Optional[sqlfluff.core.templaters.base.TemplatedFile]) sqlfluff.core.rules.functional.templated_file_slices.TemplatedFileSlices

Returns templated slices for a segment.

raw_file_slices Module

Surrogate class for working with RawFileSlice collections.

class RawFileSlices(*raw_slices, templated_file=None)

Encapsulates a sequence of one or more RawFileSlice.

The slices may or may not be contiguous in a file. Provides useful operations on a sequence of slices to simplify rule creation.

all(predicate: Optional[Callable[[sqlfluff.core.templaters.base.RawFileSlice], bool]] = None) bool

Do all the raw slices match?

any(predicate: Optional[Callable[[sqlfluff.core.templaters.base.RawFileSlice], bool]] = None) bool

Do any of the raw slices match?

select(select_if: Optional[Callable[[sqlfluff.core.templaters.base.RawFileSlice], bool]] = None, loop_while: Optional[Callable[[sqlfluff.core.templaters.base.RawFileSlice], bool]] = None, start_slice: Optional[sqlfluff.core.templaters.base.RawFileSlice] = None, stop_slice: Optional[sqlfluff.core.templaters.base.RawFileSlice] = None) sqlfluff.core.rules.functional.raw_file_slices.RawFileSlices

Retrieve range/subset.

NOTE: Iterates the slices BETWEEN start_slice and stop_slice, i.e. those slices are not included in the loop.

raw_file_slice_predicates Module

Defines commonly used raw file slice predicates for rule writers.

For consistency, all the predicates in this module are implemented as functions returning functions. This avoids rule writers having to remember the distinction between normal functions and functions returning functions.

This is not necessarily a complete set of predicates covering all possible requirements. Rule authors can define their own predicates as needed, either as regular functions, lambda, etc.

is_slice_type(*slice_types: str) Callable[[sqlfluff.core.templaters.base.RawFileSlice], bool]

Returns a function that determines if slice is one of the types.