Python API

SQLFluff exposes a public API for other Python applications to use. A basic example is given here, followed by the documentation for each method.

"""This is an example of how to use the simple sqlfluff api."""

from typing import Any, Dict, Iterator, List, Union

import sqlfluff

#  -------- LINTING ----------

my_bad_query = "SeLEct  *, 1, blah as  fOO  from mySchema.myTable"

# Lint the given string and return an array of violations in JSON representation.
lint_result = sqlfluff.lint(my_bad_query, dialect="bigquery")
# lint_result =
# [
#     {
#         "code": "CP01",
#         "line_no": 1,
#         "line_pos": 1,
#         "description": "Keywords must be consistently upper case.",
#     }
#     ...
# ]

#  -------- FIXING ----------

# Fix the given string and get a string back which has been fixed.
fix_result_1 = sqlfluff.fix(my_bad_query, dialect="bigquery")
# fix_result_1 = 'SELECT  *, 1, blah AS  foo  FROM myschema.mytable\n'

# We can also fix just specific rules.
fix_result_2 = sqlfluff.fix(my_bad_query, rules=["CP01"])
# fix_result_2 = 'SELECT  *, 1, blah AS  fOO  FROM mySchema.myTable'

# Or a subset of rules...
fix_result_3 = sqlfluff.fix(my_bad_query, rules=["CP01", "CP02"])
# fix_result_3 = 'SELECT  *, 1, blah AS  fOO  FROM myschema.mytable'

#  -------- PARSING ----------

# Parse the given string and return a JSON representation of the parsed tree.
parse_result = sqlfluff.parse(my_bad_query)
# parse_result = {'file': {'statement': {...}, 'newline': '\n'}}

# This JSON structure can then be parsed as required.
# An example usage is shown below:


def get_json_segment(
    parse_result: Dict[str, Any], segment_type: str
) -> Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
    """Recursively search JSON parse result for specified segment type.

    Args:
        parse_result (Dict[str, Any]): JSON parse result from `sqlfluff.parse`.
        segment_type (str): The segment type to search for.

    Yields:
        Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
        Retrieves children of specified segment type as either a string for a raw
        segment or as JSON or an array of JSON for non-raw segments.
    """
    for k, v in parse_result.items():
        if k == segment_type:
            yield v
        elif isinstance(v, dict):
            yield from get_json_segment(v, segment_type)
        elif isinstance(v, list):
            for s in v:
                yield from get_json_segment(s, segment_type)


# e.g. Retrieve array of JSON for table references.
table_references = list(get_json_segment(parse_result, "table_reference"))
print(table_references)
# [[{'identifier': 'mySchema'}, {'dot': '.'}, {'identifier': 'myTable'}]]

# Retrieve raw table name from last identifier in the table reference.
for table_reference in table_references:
    table_name = list(get_json_segment(parse_result, "naked_identifier"))[-1]
    print(f"table_name: {table_name}")
# table_name: myTable

Simple API commands

SQLFluff is a SQL linter for humans.

fix(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None, fix_even_unparsable: bool | None = None) str

Fix a SQL string.

Parameters:
  • sql (str) – The SQL to be fixed.

  • dialect (str, optional) – A reference to the dialect of the SQL to be fixed. Defaults to ansi.

  • rules (Optional[List[str]], optional) – A subset of rule references to fix for. Defaults to None.

  • exclude_rules (Optional[List[str]], optional) – A subset of rule references to avoid fixing for. Defaults to None.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

  • fix_even_unparsable (bool, optional) – Optional override for the corresponding SQLFluff configuration value.

Returns:

str for the fixed SQL if possible.
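
Because fix() returns a plain string, it can be compared against the input with standard tooling. The following is a minimal sketch, assuming the fixed string is the one shown in the basic example above; show_fix_diff is a hypothetical helper name, not part of SQLFluff.

```python
import difflib


def show_fix_diff(original: str, fixed: str) -> str:
    """Return a unified diff between the original and the fixed SQL."""
    diff = difflib.unified_diff(
        original.splitlines(),
        fixed.splitlines(),
        fromfile="original",
        tofile="fixed",
        lineterm="",
    )
    return "\n".join(diff)


original = "SeLEct  *, 1, blah as  fOO  from mySchema.myTable"
# In real use: fixed = sqlfluff.fix(original, dialect="bigquery")
# Here we reuse the result documented in the basic example.
fixed = "SELECT  *, 1, blah AS  foo  FROM myschema.mytable\n"

diff_text = show_fix_diff(original, fixed)
print(diff_text)
```

An empty diff means that fix() made no changes to the query.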

lint(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None) List[Dict[str, Any]]

Lint a SQL string.

Parameters:
  • sql (str) – The SQL to be linted.

  • dialect (str, optional) – A reference to the dialect of the SQL to be linted. Defaults to ansi.

  • rules (Optional[List[str]], optional) – A list of rule references to lint for. Defaults to None.

  • exclude_rules (Optional[List[str]], optional) – A list of rule references to avoid linting for. Defaults to None.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

Returns:

List[Dict[str, Any]] for each violation found.
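
Since each violation is a plain dictionary, the result can be summarised with ordinary Python. The sketch below assumes the keys shown in the basic example ("code", "line_no", "line_pos", "description"). The second and third sample entries are invented for illustration, and count_by_code and format_violation are hypothetical helper names.

```python
from collections import Counter
from typing import Any, Dict, List

# Sample data shaped like the documented output of sqlfluff.lint().
# In real use: violations = sqlfluff.lint(sql, dialect="bigquery")
violations: List[Dict[str, Any]] = [
    {
        "code": "CP01",
        "line_no": 1,
        "line_pos": 1,
        "description": "Keywords must be consistently upper case.",
    },
    # The two entries below are illustrative only.
    {
        "code": "CP01",
        "line_no": 1,
        "line_pos": 20,
        "description": "Keywords must be consistently upper case.",
    },
    {
        "code": "LT01",
        "line_no": 1,
        "line_pos": 7,
        "description": "(illustrative description)",
    },
]


def count_by_code(items: List[Dict[str, Any]]) -> Counter:
    """Count how many violations were reported per rule code."""
    return Counter(item["code"] for item in items)


def format_violation(item: Dict[str, Any]) -> str:
    """Render one violation as 'line:pos [code] description'."""
    return (
        f"{item['line_no']}:{item['line_pos']} "
        f"[{item['code']}] {item['description']}"
    )


for violation in violations:
    print(format_violation(violation))
print(count_by_code(violations))
```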

parse(sql: str, dialect: str = 'ansi', config: FluffConfig | None = None, config_path: str | None = None) Dict[str, Any]

Parse a SQL string.

Parameters:
  • sql (str) – The SQL to be parsed.

  • dialect (str, optional) – A reference to the dialect of the SQL to be parsed. Defaults to ansi.

  • config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.

  • config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.

Returns:

Dict[str, Any] JSON containing the parsed structure.

Note

In the case of multiple potential variants from the raw source file only the first variant is returned by the simple API. For access to the other variants, use the underlying main API directly.
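
The nested structure returned by parse() can be walked with a short recursive function. The sketch below collects every raw string leaf to rebuild the SQL text. It assumes that whitespace and newline segments are present in the record, and sample_tree is a hand-written approximation rather than real parser output, which may differ by dialect and version. iter_raw is a hypothetical helper name.

```python
from typing import Any, Iterator


def iter_raw(node: Any) -> Iterator[str]:
    """Yield the raw string leaves of a parse record in document order."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for child in node.values():
            yield from iter_raw(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_raw(child)
    # Any other value (e.g. None) carries no raw text and is skipped.


# Hand-written approximation of a parse record for "SELECT 1\n".
# In real use: sample_tree = sqlfluff.parse("SELECT 1\n")
sample_tree = {
    "file": {
        "statement": {
            "select_statement": {
                "select_clause": [
                    {"keyword": "SELECT"},
                    {"whitespace": " "},
                    {"select_clause_element": {"numeric_literal": "1"}},
                ]
            }
        },
        "newline": "\n",
    }
}

rebuilt = "".join(iter_raw(sample_tree))
print(repr(rebuilt))
```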

Advanced API usage

The simple API presents only a fraction of the functionality present within the core SQLFluff library. For more advanced use cases, users can import the Linter() and FluffConfig() classes from sqlfluff.core. As of version 0.4.0 this is considered experimental, as the internals may change without warning in any future release. If you come to rely on the internals of SQLFluff, please post an issue on GitHub to share what you’re up to. This will help shape a more reliable, tidy and well-documented public API.

Configuring SQLFluff

You can use the FluffConfig() class to configure SQLFluff behaviour.

"""This is an example of providing config overrides."""

from sqlfluff.core import FluffConfig, Linter

sql = "SELECT 1\n"


config = FluffConfig(
    overrides={
        "dialect": "snowflake",
        # NOTE: We explicitly set the string "none" here rather
        # than a None literal so that it overrides any config
        # set by any config files in the path.
        "library_path": "none",
    }
)

linted_file = Linter(config=config).lint_string(sql)

assert linted_file.get_violations() == []

Instances of FluffConfig() can be created manually, or parsed.

"""An example to show a few ways of configuring the API."""

import sqlfluff
from sqlfluff.core import FluffConfig, Linter

# #######################################
# The simple API can be configured in three ways.

# 1. Limited keyword arguments
sqlfluff.fix("SELECT  1", dialect="bigquery")

# 2. Providing the path to a config file
sqlfluff.fix("SELECT  1", config_path="test/fixtures/.sqlfluff")

# 3. Providing a preconfigured FluffConfig object.
# NOTE: This is the way of configuring SQLFluff which will give the most control.

# 3a. FluffConfig objects can be created directly from a dictionary of values.
config = FluffConfig(configs={"core": {"dialect": "bigquery"}})
# 3b. FluffConfig objects can be created from a config file in a string.
config = FluffConfig.from_string("[sqlfluff]\ndialect=bigquery\n")
# 3c. FluffConfig objects can be created from a config file in multiple strings
#     to simulate the effect of multiple nested config strings.
config = FluffConfig.from_strings(
    # NOTE: Given these two strings, the resulting dialect would be "mysql"
    # as the later files take precedence.
    "[sqlfluff]\ndialect=bigquery\n",
    "[sqlfluff]\ndialect=mysql\n",
)
# 3d. FluffConfig objects can be created from a path containing a config file.
config = FluffConfig.from_path("test/fixtures/")
# 3e. FluffConfig objects can be created from keyword arguments.
config = FluffConfig.from_kwargs(dialect="bigquery", rules=["LT01"])

# The FluffConfig is then provided via a config argument.
sqlfluff.fix("SELECT  1", config=config)


# #######################################
# The core API is always configured using a FluffConfig object.

# When instantiating a Linter (or Parser), a FluffConfig must be provided
# on instantiation. See above for details on how to create a FluffConfig.
linter = Linter(config=config)

# The provided config will then be used in any operations.

lint_result = linter.lint_string("SELECT  1", fix=True)
fixed_string = lint_result.fix_string()
# NOTE: The "True" element shows that fixing was a success.
assert fixed_string == ("SELECT 1", True)

Supported dialects and rules are available through list_dialects() and list_rules().

"""This is an example of how to get basic options from sqlfluff."""

import sqlfluff

#  -------- DIALECTS ----------

dialects = sqlfluff.list_dialects()
# dialects = [DialectTuple(label='ansi', name='ansi', inherits_from='nothing'), ...]
dialect_names = [dialect.label for dialect in dialects]
# dialect_names = ["ansi", "snowflake", ...]


#  -------- RULES ----------

rules = sqlfluff.list_rules()
# rules = [
#     RuleTuple(
#         code='Example_LT01',
#         description='ORDER BY on these columns is forbidden!'
#     ),
#     ...
# ]
rule_codes = [rule.code for rule in rules]
# rule_codes = ["LT01", "LT02", ...]
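
Rule codes such as "LT01" and "CP01" share a letter prefix, so the list can be grouped by rule family. This is a minimal sketch: RuleInfo is a hypothetical stand-in that mirrors only the code and description fields shown above, and group_codes_by_prefix is not part of SQLFluff.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class RuleInfo(NamedTuple):
    """Hypothetical stand-in for the tuples returned by list_rules()."""

    code: str
    description: str


def group_codes_by_prefix(rules: List[RuleInfo]) -> Dict[str, List[str]]:
    """Group rule codes by their letter prefix, e.g. 'LT' or 'CP'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for rule in rules:
        # Strip the trailing digits to leave the letter prefix.
        prefix = rule.code.rstrip("0123456789")
        groups[prefix].append(rule.code)
    return dict(groups)


# In real use: rules = sqlfluff.list_rules()
sample_rules = [
    RuleInfo("LT01", "(description omitted)"),
    RuleInfo("LT02", "(description omitted)"),
    RuleInfo("CP01", "(description omitted)"),
]

grouped = group_codes_by_prefix(sample_rules)
print(grouped)
```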

Advanced API reference

The core elements of sqlfluff.

class FluffConfig(configs: ConfigMappingType | None = None, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None, require_dialect: bool = True)

The class that actually gets passed around as a config object.

diff_to(other: FluffConfig) Dict[str, Any]

Compare this config to another.

Parameters:

other (FluffConfig) – Another config object to compare against. We will return keys from this object that are not in other or are different to those in other.

Returns:

A filtered dict of items in this config that are not in the other or are different to the other.

classmethod from_kwargs(config: FluffConfig | None = None, dialect: str | None = None, rules: List[str] | None = None, exclude_rules: List[str] | None = None, require_dialect: bool = True) FluffConfig

Instantiate a config from either an existing config or kwargs.

This is a convenience method for the ways that the public classes like Linter(), Parser() and Lexer() can be instantiated with a FluffConfig or with the convenience kwargs: dialect & rules.

classmethod from_path(path: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None) FluffConfig

Loads a config object given a particular path.

classmethod from_root(extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, **kw: Any) FluffConfig

Loads a config object just based on the root directory.

classmethod from_string(config_string: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig

Loads a config object from a single config string.

classmethod from_strings(*config_strings: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig

Loads a config object given a series of nested config strings.

Config strings are incorporated from first to last, treating the first element as the “root” config, and then later config strings will take precedence over any earlier values.
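
The precedence rule can be pictured as a nested dictionary merge in which later values win. The sketch below illustrates that idea only and is not the library's implementation. merge_nested is a hypothetical name, and the sample values are chosen for illustration.

```python
from typing import Any, Dict


def merge_nested(*configs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge dicts from first to last; later values take precedence."""
    merged: Dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are sections: merge them key by key.
                merged[key] = merge_nested(merged[key], value)
            else:
                # Otherwise the later value replaces the earlier one.
                merged[key] = value
    return merged


# Illustrative values: the first dict plays the role of the "root" config.
root = {"core": {"dialect": "bigquery", "max_line_length": 80}}
child = {"core": {"dialect": "mysql"}}

result = merge_nested(root, child)
print(result)
```

Keys that only appear in the earlier config survive the merge, while keys present in both take the later value.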

get(val: str, section: str | Iterable[str] = 'core', default: Any = None) Any

Get a particular value from the config.

get_section(section: str | Iterable[str]) Any

Return a whole section of config as a dict.

If the element found at the address is a value and not a section, it is still returned, so this can be used as a more advanced form of the basic get method.

Parameters:

section – An iterable or string. If it’s a string we load that root section. If it’s an iterable of strings, then we treat it as a path within the dictionary structure.
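
The addressing scheme can be sketched over a plain nested dictionary. This is an illustration of the lookup described above, not the library's code. lookup_section is a hypothetical name, the sample config values are invented, and returning None for a missing path is an assumption of the sketch.

```python
from typing import Any, Dict, Iterable, Union


def lookup_section(
    config: Dict[str, Any], section: Union[str, Iterable[str]]
) -> Any:
    """Look up a root section (string) or a nested path (iterable)."""
    path = [section] if isinstance(section, str) else list(section)
    node: Any = config
    for key in path:
        # Assumption: a missing or non-dict step yields None.
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Illustrative config structure.
sample_config = {
    "core": {"dialect": "ansi"},
    "rules": {"capitalisation.keywords": {"capitalisation_policy": "upper"}},
}

print(lookup_section(sample_config, "core"))
print(
    lookup_section(
        sample_config,
        ("rules", "capitalisation.keywords", "capitalisation_policy"),
    )
)
```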

get_templater(**kwargs: Any) RawTemplater

Instantiate the configured templater.

get_templater_class() Type['RawTemplater']

Get the configured templater class.

NOTE: This is mostly useful to call directly when rules want to determine the type of a templater (in particular to work out if it’s a derivative of the jinja templater) without needing to instantiate a full templater. Instantiated templaters don’t pickle well, so aren’t automatically passed around between threads/processes.

iter_vals(cfg: Dict[str, Any] | None = None) Iterable[Tuple[Any, ...]]

Return an iterable of tuples representing keys.

We show values before dicts. Each tuple contains an indent value so that we know what level of the dict we’re in. Dict labels will be returned as a blank value before their content.
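
The traversal order described here can be sketched over a plain dictionary. The tuple layout (indent, key, value) and the alphabetical ordering of keys are assumptions made for this sketch, and iter_config_vals is a hypothetical name rather than the library's implementation.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_config_vals(cfg: Dict[str, Any]) -> Iterator[Tuple[int, str, Any]]:
    """Yield (indent, key, value) tuples, plain values before sections."""
    keys = sorted(cfg)
    # First pass: plain values at this level.
    for key in keys:
        if not isinstance(cfg[key], dict):
            yield (0, key, cfg[key])
    # Second pass: each section label with a blank value, then its content.
    for key in keys:
        if isinstance(cfg[key], dict):
            yield (0, key, "")
            for indent, child_key, value in iter_config_vals(cfg[key]):
                yield (indent + 1, child_key, value)


# Illustrative config structure.
sample = {"core": {"dialect": "ansi"}, "verbose": 0}

rows = list(iter_config_vals(sample))
for indent, key, value in rows:
    print("    " * indent + f"{key}: {value}")
```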

make_child_from_path(path: str) FluffConfig

Make a child config at a path but pass on overrides and extra_config_path.

process_inline_config(config_line: str, fname: str) None

Process an inline config command and update self.

process_raw_file_for_config(raw_str: str, fname: str) None

Process a full raw file for inline config and update self.

set_value(config_path: Iterable[str], val: Any) None

Set a value at a given path.

verify_dialect_specified() None

Check if the config specifies a dialect, raising an error if not.

class Lexer(config: FluffConfig | None = None, last_resort_lexer: StringLexer | None = None, dialect: str | None = None)

The Lexer class actually does the lexing step.

elements_to_segments(elements: List[TemplateElement], templated_file: TemplatedFile) Tuple[RawSegment, ...]

Convert a list of lexed elements into a tuple of segments.

lex(raw: str | TemplatedFile) Tuple[Tuple[BaseSegment, ...], List[SQLLexError]]

Take a string or TemplatedFile and return segments.

If we fail to match the whole string, then we must have found something that we cannot lex. If that happens we should package it up as unlexable and keep track of the exceptions.

static lex_match(forward_string: str, lexer_matchers: List[StringLexer]) LexMatch

Iteratively match strings using the selection of submatchers.

static map_template_slices(elements: List[LexedElement], template: TemplatedFile) List[TemplateElement]

Create a list of TemplateElement from a list of LexedElement.

This adds slices in the templated file to the original lexed elements. We’ll need this to work out the position in the source file.

static violations_from_segments(segments: Tuple[RawSegment, ...]) List[SQLLexError]

Generate any lexing errors for any unlexables.

class Linter(config: FluffConfig | None = None, formatter: Any = None, dialect: str | None = None, rules: List[str] | None = None, user_rules: List[Type[BaseRule]] | None = None, exclude_rules: List[str] | None = None)

The interface class to interact with the linter.

classmethod allowed_rule_ref_map(reference_map: Dict[str, Set[str]], disable_noqa_except: str | None) Dict[str, Set[str]]

Generate a noqa rule reference map.

fix(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) Tuple[BaseSegment, List[SQLBaseError]]

Return the fixed tree and violations from lintfix when we’re fixing.

get_rulepack(config: FluffConfig | None = None) RulePack

Get hold of a set of rules.

lint(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) List[SQLBaseError]

Return just the violations from lintfix when we’re only linting.

classmethod lint_fix_parsed(tree: BaseSegment, config: FluffConfig, rule_pack: RulePack, fix: bool = False, fname: str | None = None, templated_file: TemplatedFile | None = None, formatter: Any = None) Tuple[BaseSegment, List[SQLBaseError], IgnoreMask | None, List[Tuple[str, str, float]]]

Lint and optionally fix a tree object.

classmethod lint_parsed(parsed: ParsedString, rule_pack: RulePack, fix: bool = False, formatter: Any = None, encoding: str = 'utf8') LintedFile

Lint a ParsedString and return a LintedFile.

lint_path(path: str, fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None) LintedDir

Lint a path.

lint_paths(paths: Tuple[str, ...], fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None, apply_fixes: bool = False, fixed_file_suffix: str = '', fix_even_unparsable: bool = False, retain_files: bool = True) LintingResult

Lint an iterable of paths.

classmethod lint_rendered(rendered: RenderedFile, rule_pack: RulePack, fix: bool = False, formatter: Any = None) LintedFile

Take a RenderedFile and return a LintedFile.

lint_string(in_str: str = '', fname: str = '<string input>', fix: bool = False, config: FluffConfig | None = None, encoding: str = 'utf8') LintedFile

Lint a string.

Returns:

An object representing that linted file.

Return type:

LintedFile

lint_string_wrapped(string: str, fname: str = '<string input>', fix: bool = False) LintingResult

Lint strings directly.

static load_raw_file_and_config(fname: str, root_config: FluffConfig) Tuple[str, FluffConfig, str]

Load a raw file and the associated config.

parse_path(path: str, parse_statistics: bool = False) Iterator[ParsedString]

Parse a path of SQL files.

NB: This is a generator which will yield the result of each file within the path iteratively.

classmethod parse_rendered(rendered: RenderedFile, parse_statistics: bool = False) ParsedString

Parse a rendered file.

parse_string(in_str: str, fname: str = '<string>', config: FluffConfig | None = None, encoding: str = 'utf-8', parse_statistics: bool = False) ParsedString

Parse a string.

static remove_templated_errors(linting_errors: List[SQLBaseError]) List[SQLBaseError]

Filter a list of lint errors, removing those from the templated slices.

render_file(fname: str, root_config: FluffConfig) RenderedFile

Load and render a file with relevant config.

render_string(in_str: str, fname: str, config: FluffConfig, encoding: str) RenderedFile

Template the file.

rule_tuples() List[RuleTuple]

A simple pass through to access the rule tuples of the rule set.

class Parser(config: FluffConfig | None = None, dialect: str | None = None)

Instantiates parsed queries from a sequence of lexed raw segments.

parse(segments: Sequence[BaseSegment], fname: str | None = None, parse_statistics: bool = False) BaseSegment | None

Parse a series of lexed tokens using the current dialect.