Python API

SQLFluff exposes a public api for other python applications to use. A basic example of this usage is given here, with the documentation for each of the methods below.

"""This is an example of how to use the simple sqlfluff api."""

from typing import Any, Dict, Iterator, List, Union

import sqlfluff

#  -------- LINTING ----------

my_bad_query = "SeLEct  *, 1, blah as  fOO  from mySchema.myTable"

# Lint the given string and return an array of violations in JSON representation.
lint_result = sqlfluff.lint(my_bad_query, dialect="bigquery")
# lint_result =
# [
#     {
#         "code": "CP01",
#         "line_no": 1,
#         "line_pos": 1,
#         "description": "Keywords must be consistently upper case.",
#     }
#     ...
# ]

#  -------- FIXING ----------

# Fix the given string and get a string back which has been fixed.
fix_result_1 = sqlfluff.fix(my_bad_query, dialect="bigquery")
# fix_result_1 = 'SELECT  *, 1, blah AS  foo  FROM myschema.mytable\n'

# We can also fix just specific rules.
fix_result_2 = sqlfluff.fix(my_bad_query, rules=["CP01"])
# fix_result_2 = 'SELECT  *, 1, blah AS  fOO  FROM mySchema.myTable'

# Or a subset of rules...
fix_result_3 = sqlfluff.fix(my_bad_query, rules=["CP01", "CP02"])
# fix_result_3 = 'SELECT  *, 1, blah AS  fOO  FROM myschema.mytable'

#  -------- PARSING ----------

# Parse the given string and return a JSON representation of the parsed tree.
parse_result = sqlfluff.parse(my_bad_query)
# parse_result = {'file': {'statement': {...}, 'newline': '\n'}}

# This JSON structure can then be parsed as required.
# An example usage is shown below:


def get_json_segment(
    parse_result: Dict[str, Any], segment_type: str
) -> Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
    """Recursively search JSON parse result for specified segment type.

    Args:
        parse_result (Dict[str, Any]): JSON parse result from `sqlfluff.fix`.
        segment_type (str): The segment type to search for.

    Yields:
        Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
        Retrieves children of specified segment type as either a string for a raw
        segment or as JSON or an array of JSON for non-raw segments.
    """
    for k, v in parse_result.items():
        if k == segment_type:
            yield v
        elif isinstance(v, dict):
            yield from get_json_segment(v, segment_type)
        elif isinstance(v, list):
            for s in v:
                yield from get_json_segment(s, segment_type)


# e.g. Retrieve array of JSON for table references.
table_references = list(get_json_segment(parse_result, "table_reference"))
print(table_references)
# [[{'identifier': 'mySchema'}, {'dot': '.'}, {'identifier': 'myTable'}]]

# Retrieve raw table name from last identifier in the table reference.
for table_reference in table_references:
    table_name = list(get_json_segment(parse_result, "naked_identifier"))[-1]
    print(f"table_name: {table_name}")
# table_name: myTable

Simple API commands

Sqlfluff is a SQL linter for humans.

fix(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None, fix_even_unparsable: bool | None = None) str

Fix a SQL string.

Parameters:
  • sql (str) – The SQL to be fixed.

  • dialect (str, optional) – A reference to the dialect of the SQL to be fixed. Defaults to ansi.

  • rules (Optional[List[str], optional) – A subset of rule references to fix for. Defaults to None.

  • exclude_rules (Optional[List[str], optional) – A subset of rule references to avoid fixing for. Defaults to None.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

  • fix_even_unparsable (bool, optional) – Optional override for the corresponding SQLFluff configuration value.

Returns:

str for the fixed SQL if possible.

lint(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None) List[Dict[str, Any]]

Lint a SQL string.

Parameters:
  • sql (str) – The SQL to be linted.

  • dialect (str, optional) – A reference to the dialect of the SQL to be linted. Defaults to ansi.

  • rules (Optional[List[str], optional) – A list of rule references to lint for. Defaults to None.

  • exclude_rules (Optional[List[str], optional) – A list of rule references to avoid linting for. Defaults to None.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

Returns:

List[Dict[str, Any]] for each violation found.

parse(sql: str, dialect: str = 'ansi', config: FluffConfig | None = None, config_path: str | None = None) Dict[str, Any]

Parse a SQL string.

Parameters:
  • sql (str) – The SQL to be parsed.

  • dialect (str, optional) – A reference to the dialect of the SQL to be parsed. Defaults to ansi.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

Returns:

Dict[str, Any] JSON containing the parsed structure.

Note

In the case of multiple potential variants from the raw source file only the first variant is returned by the simple API. For access to the other variants, use the underlying main API directly.

Advanced API usage

The simple API presents only a fraction of the functionality present within the core SQLFluff library. For more advanced use cases, users can import the Linter() and FluffConfig() classes from sqlfluff.core. As of version 0.4.0 this is considered as experimental only as the internals may change without warning in any future release. If you come to rely on the internals of SQLFluff, please post an issue on GitHub to share what you’re up to. This will help shape a more reliable, tidy and well documented public API for use.

Configuring SQLFluff

You can use FluffConfig() class to configure SQLFluff behaviour.

"""This is an example of providing config overrides."""

from sqlfluff.core import FluffConfig, Linter

sql = "SELECT 1\n"


config = FluffConfig(
    overrides={
        "dialect": "snowflake",
        # NOTE: We explicitly set the string "none" here rather
        # than a None literal so that it overrides any config
        # set by any config files in the path.
        "library_path": "none",
    }
)

linted_file = Linter(config=config).lint_string(sql)

assert linted_file.get_violations() == []

Instances of FluffConfig() can be created manually, or parsed.

"""An example to show a few ways of configuring the API."""

import sqlfluff
from sqlfluff.core import FluffConfig, Linter

# #######################################
# The simple API can be configured in three ways.

# 1. Limited keyword arguments
sqlfluff.fix("SELECT  1", dialect="bigquery")

# 2. Providing the path to a config file
sqlfluff.fix("SELECT  1", config_path="test/fixtures/.sqlfluff")

# 3. Providing a preconfigured FluffConfig object.
# NOTE: This is the way of configuring SQLFluff which will give the most control.

# 3a. FluffConfig objects can be created directly from a dictionary of values.
config = FluffConfig(configs={"core": {"dialect": "bigquery"}})
# 3b. FluffConfig objects can be created from a config file in a string.
config = FluffConfig.from_string("[sqlfluff]\ndialect=bigquery\n")
# 3c. FluffConfig objects can be created from a config file in multiple strings
#     to simulate the effect of multiple nested config strings.
config = FluffConfig.from_strings(
    # NOTE: Given these two strings, the resulting dialect would be "mysql"
    # as the later files take precedence.
    "[sqlfluff]\ndialect=bigquery\n",
    "[sqlfluff]\ndialect=mysql\n",
)
# 3d. FluffConfig objects can be created from a path containing a config file.
config = FluffConfig.from_path("test/fixtures/")
# 3e. FluffConfig objects can be from keyword arguments
config = FluffConfig.from_kwargs(dialect="bigquery", rules=["LT01"])

# The FluffConfig is then provided via a config argument.
sqlfluff.fix("SELECT  1", config=config)


# #######################################
# The core API is always configured using a FluffConfig object.

# When instantiating a Linter (or Parser), a FluffConfig must be provided
# on instantiation. See above for details on how to create a FluffConfig.
linter = Linter(config=config)

# The provided config will then be used in any operations.

lint_result = linter.lint_string("SELECT  1", fix=True)
fixed_string = lint_result.fix_string()
# NOTE: The "True" element shows that fixing was a success.
assert fixed_string == ("SELECT 1", True)

Supported dialects and rules are available through list_dialects() and list_rules().

"""This is an example of how get basic options from sqlfluff."""

import sqlfluff

#  -------- DIALECTS ----------

dialects = sqlfluff.list_dialects()
# dialects = [DialectTuple(label='ansi', name='ansi', inherits_from='nothing'), ...]
dialect_names = [dialect.label for dialect in dialects]
# dialect_names = ["ansi", "snowflake", ...]


#  -------- RULES ----------

rules = sqlfluff.list_rules()
# rules = [
#     RuleTuple(
#         code='Example_LT01',
#         description='ORDER BY on these columns is forbidden!'
#     ),
#     ...
# ]
rule_codes = [rule.code for rule in rules]
# rule_codes = ["LT01", "LT02", ...]

Advanced API reference

The core elements of sqlfluff.

class Lexer(config: FluffConfig | None = None, last_resort_lexer: StringLexer | None = None, dialect: str | None = None)

The Lexer class actually does the lexing step.

elements_to_segments(elements: List[TemplateElement], templated_file: TemplatedFile) Tuple[RawSegment, ...]

Convert a tuple of lexed elements into a tuple of segments.

lex(raw: str | TemplatedFile) Tuple[Tuple[BaseSegment, ...], List[SQLLexError]]

Take a string or TemplatedFile and return segments.

If we fail to match the whole string, then we must have found something that we cannot lex. If that happens we should package it up as unlexable and keep track of the exceptions.

static lex_match(forward_string: str, lexer_matchers: List[StringLexer]) LexMatch

Iteratively match strings using the selection of submatchers.

static map_template_slices(elements: List[LexedElement], template: TemplatedFile) List[TemplateElement]

Create a tuple of TemplateElement from a tuple of LexedElement.

This adds slices in the templated file to the original lexed elements. We’ll need this to work out the position in the source file.

static violations_from_segments(segments: Tuple[RawSegment, ...]) List[SQLLexError]

Generate any lexing errors for any unlexables.

class Linter(config: FluffConfig | None = None, formatter: Any = None, dialect: str | None = None, rules: List[str] | None = None, user_rules: List[Type[BaseRule]] | None = None, exclude_rules: List[str] | None = None)

The interface class to interact with the linter.

classmethod allowed_rule_ref_map(reference_map: Dict[str, Set[str]], disable_noqa_except: str | None) Dict[str, Set[str]]

Generate a noqa rule reference map.

fix(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) Tuple[BaseSegment, List[SQLBaseError]]

Return the fixed tree and violations from lintfix when we’re fixing.

get_rulepack(config: FluffConfig | None = None) RulePack

Get hold of a set of rules.

lint(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) List[SQLBaseError]

Return just the violations from lintfix when we’re only linting.

classmethod lint_fix_parsed(tree: BaseSegment, config: FluffConfig, rule_pack: RulePack, fix: bool = False, fname: str | None = None, templated_file: TemplatedFile | None = None, formatter: Any = None) Tuple[BaseSegment, List[SQLBaseError], IgnoreMask | None, List[Tuple[str, str, float]]]

Lint and optionally fix a tree object.

classmethod lint_parsed(parsed: ParsedString, rule_pack: RulePack, fix: bool = False, formatter: Any = None, encoding: str = 'utf8') LintedFile

Lint a ParsedString and return a LintedFile.

lint_path(path: str, fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None) LintedDir

Lint a path.

lint_paths(paths: Tuple[str, ...], fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None, apply_fixes: bool = False, fixed_file_suffix: str = '', fix_even_unparsable: bool = False, retain_files: bool = True) LintingResult

Lint an iterable of paths.

classmethod lint_rendered(rendered: RenderedFile, rule_pack: RulePack, fix: bool = False, formatter: Any = None) LintedFile

Take a RenderedFile and return a LintedFile.

lint_string(in_str: str = '', fname: str = '<string input>', fix: bool = False, config: FluffConfig | None = None, encoding: str = 'utf8') LintedFile

Lint a string.

Returns:

an object representing that linted file.

Return type:

LintedFile

lint_string_wrapped(string: str, fname: str = '<string input>', fix: bool = False) LintingResult

Lint strings directly.

static load_raw_file_and_config(fname: str, root_config: FluffConfig) Tuple[str, FluffConfig, str]

Load a raw file and the associated config.

parse_path(path: str, parse_statistics: bool = False) Iterator[ParsedString]

Parse a path of sql files.

NB: This a generator which will yield the result of each file within the path iteratively.

classmethod parse_rendered(rendered: RenderedFile, parse_statistics: bool = False) ParsedString

Parse a rendered file.

parse_string(in_str: str, fname: str = '<string>', config: FluffConfig | None = None, encoding: str = 'utf-8', parse_statistics: bool = False) ParsedString

Parse a string.

static remove_templated_errors(linting_errors: List[SQLBaseError]) List[SQLBaseError]

Filter a list of lint errors, removing those from the templated slices.

render_file(fname: str, root_config: FluffConfig) RenderedFile

Load and render a file with relevant config.

render_string(in_str: str, fname: str, config: FluffConfig, encoding: str) RenderedFile

Template the file.

rule_tuples() List[RuleTuple]

A simple pass through to access the rule tuples of the rule set.

class Parser(config: FluffConfig | None = None, dialect: str | None = None)

Instantiates parsed queries from a sequence of lexed raw segments.

parse(segments: Sequence[BaseSegment], fname: str | None = None, parse_statistics: bool = False) BaseSegment | None

Parse a series of lexed tokens using the current dialect.