Python API Reference¶
SQLFluff exposes a public API for other Python applications to use. A basic example of this usage is given here, with the documentation for each of the methods below.
"""This is an example of how to use the simple sqlfluff api."""
from typing import Any, Dict, Iterator, List, Union
import sqlfluff
# -------- LINTING ----------
my_bad_query = "SeLEct *, 1, blah as fOO from mySchema.myTable"
# Lint the given string and return an array of violations in JSON representation.
lint_result = sqlfluff.lint(my_bad_query, dialect="bigquery")
# lint_result =
# [
# {
# "code": "CP01",
# "line_no": 1,
# "line_pos": 1,
# "description": "Keywords must be consistently upper case.",
# }
# ...
# ]
# -------- FIXING ----------
# Fix the given string and get a string back which has been fixed.
fix_result_1 = sqlfluff.fix(my_bad_query, dialect="bigquery")
# fix_result_1 = 'SELECT *, 1, blah AS foo FROM myschema.mytable\n'
# We can also fix just specific rules.
fix_result_2 = sqlfluff.fix(my_bad_query, rules=["CP01"])
# fix_result_2 = 'SELECT *, 1, blah AS fOO FROM mySchema.myTable'
# Or a subset of rules...
fix_result_3 = sqlfluff.fix(my_bad_query, rules=["CP01", "CP02"])
# fix_result_3 = 'SELECT *, 1, blah AS fOO FROM myschema.mytable'
# -------- PARSING ----------
# Parse the given string and return a JSON representation of the parsed tree.
parse_result = sqlfluff.parse(my_bad_query)
# parse_result = {'file': {'statement': {...}, 'newline': '\n'}}
# This JSON structure can then be parsed as required.
# An example usage is shown below:
def get_json_segment(
parse_result: Dict[str, Any], segment_type: str
) -> Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
"""Recursively search JSON parse result for specified segment type.
Args:
parse_result (Dict[str, Any]): JSON parse result from `sqlfluff.parse`.
segment_type (str): The segment type to search for.
Yields:
Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
Retrieves children of specified segment type as either a string for a raw
segment or as JSON or an array of JSON for non-raw segments.
"""
for k, v in parse_result.items():
if k == segment_type:
yield v
elif isinstance(v, dict):
yield from get_json_segment(v, segment_type)
elif isinstance(v, list):
for s in v:
yield from get_json_segment(s, segment_type)
# e.g. Retrieve array of JSON for table references.
table_references = list(get_json_segment(parse_result, "table_reference"))
print(table_references)
# [[{'identifier': 'mySchema'}, {'dot': '.'}, {'identifier': 'myTable'}]]
# Retrieve raw table name from last identifier in the table reference.
for table_reference in table_references:
table_name = list(get_json_segment(parse_result, "naked_identifier"))[-1]
print(f"table_name: {table_name}")
# table_name: myTable
Simple API commands¶
Sqlfluff is a SQL linter for humans.
- fix(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None, fix_even_unparsable: bool | None = None) str ¶
Fix a SQL string.
- Parameters:
  - sql (str) – The SQL to be fixed.
  - dialect (str, optional) – A reference to the dialect of the SQL to be fixed. Defaults to ansi.
  - rules (Optional[List[str]], optional) – A subset of rule references to fix for. Defaults to None.
  - exclude_rules (Optional[List[str]], optional) – A subset of rule references to avoid fixing for. Defaults to None.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
  - fix_even_unparsable (bool, optional) – Optional override for the corresponding SQLFluff configuration value.
- Returns:
  str for the fixed SQL if possible.
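As a hedged illustration of how these keyword arguments combine, the sketch below excludes one rule and sets the fix_even_unparsable override; the SQL string and rule code are arbitrary examples.
"""A minimal sketch of sqlfluff.fix with optional arguments."""
import sqlfluff

fixed_sql = sqlfluff.fix(
    "select 1 from my_table",
    dialect="ansi",
    exclude_rules=["CP01"],  # Leave keyword capitalisation untouched.
    fix_even_unparsable=True,  # Override the config value of the same name.
)
# fixed_sql now holds the rewritten SQL string.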
- lint(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None) List[Dict[str, Any]] ¶
Lint a SQL string.
- Parameters:
  - sql (str) – The SQL to be linted.
  - dialect (str, optional) – A reference to the dialect of the SQL to be linted. Defaults to ansi.
  - rules (Optional[List[str]], optional) – A list of rule references to lint for. Defaults to None.
  - exclude_rules (Optional[List[str]], optional) – A list of rule references to avoid linting for. Defaults to None.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
- Returns:
  List[Dict[str, Any]] for each violation found.
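Since each violation is a plain dict with the keys shown in the linting example above, the results can be filtered or reported directly. A short sketch:
"""A minimal sketch of reporting sqlfluff.lint results."""
import sqlfluff

violations = sqlfluff.lint("SeLEct 1 from tbl\n", dialect="ansi")
for violation in violations:
    # Each violation dict carries "code", "line_no", "line_pos" and "description".
    print(f'{violation["code"]} (line {violation["line_no"]}): {violation["description"]}')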
- parse(sql: str, dialect: str = 'ansi', config: FluffConfig | None = None, config_path: str | None = None) Dict[str, Any] ¶
Parse a SQL string.
- Parameters:
  - sql (str) – The SQL to be parsed.
  - dialect (str, optional) – A reference to the dialect of the SQL to be parsed. Defaults to ansi.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
- Returns:
  Dict[str, Any] JSON containing the parsed structure.
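Because the return value is a plain nested dict, it can be inspected with standard library tools; for example, pretty-printing it with json (a small sketch):
"""A minimal sketch of inspecting sqlfluff.parse output."""
import json

import sqlfluff

parse_result = sqlfluff.parse("SELECT 1\n", dialect="ansi")
# The structure is JSON-serialisable, so it can be pretty-printed or saved.
print(json.dumps(parse_result, indent=2))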
Note
In the case of multiple potential variants from the raw source file, only the first variant is returned by the simple API. For access to the other variants, use the underlying main API directly.
Advanced API usage¶
The simple API presents only a fraction of the functionality present within the core SQLFluff library. For more advanced use cases, users can import the Linter() and FluffConfig() classes from sqlfluff.core. As of version 0.4.0 this is considered experimental only, as the internals may change without warning in any future release. If you come to rely on the internals of SQLFluff, please post an issue on GitHub to share what you’re up to. This will help shape a more reliable, tidy and well documented public API for use.
Configuring SQLFluff¶
You can use the FluffConfig() class to configure SQLFluff behaviour.
"""This is an example of providing config overrides."""
from sqlfluff.core import FluffConfig, Linter
sql = "SELECT 1\n"
config = FluffConfig(
overrides={
"dialect": "snowflake",
# NOTE: We explicitly set the string "none" here rather
# than a None literal so that it overrides any config
# set by any config files in the path.
"library_path": "none",
}
)
linted_file = Linter(config=config).lint_string(sql)
assert linted_file.get_violations() == []
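Building on the snippet above, the same pattern surfaces violations when a string does not lint cleanly. A brief self-contained sketch; the SQL here is an arbitrary example.
"""An example of inspecting violations from the Linter."""
from sqlfluff.core import FluffConfig, Linter

config = FluffConfig(overrides={"dialect": "snowflake", "library_path": "none"})
linted_file = Linter(config=config).lint_string("SeLEct 1\n")
for violation in linted_file.get_violations():
    # Each entry is a violation object describing a single issue.
    print(violation)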
Instances of FluffConfig() can be created manually, or parsed from configuration files, strings or keyword arguments.
"""An example to show a few ways of configuring the API."""
import sqlfluff
from sqlfluff.core import FluffConfig, Linter
# #######################################
# The simple API can be configured in three ways.
# 1. Limited keyword arguments
sqlfluff.fix("SELECT 1", dialect="bigquery")
# 2. Providing the path to a config file
sqlfluff.fix("SELECT 1", config_path="test/fixtures/.sqlfluff")
# 3. Providing a preconfigured FluffConfig object.
# NOTE: This is the way of configuring SQLFluff which will give the most control.
# 3a. FluffConfig objects can be created directly from a dictionary of values.
config = FluffConfig(configs={"core": {"dialect": "bigquery"}})
# 3b. FluffConfig objects can be created from a config file in a string.
config = FluffConfig.from_string("[sqlfluff]\ndialect=bigquery\n")
# 3c. FluffConfig objects can be created from a config file in multiple strings
# to simulate the effect of multiple nested config strings.
config = FluffConfig.from_strings(
# NOTE: Given these two strings, the resulting dialect would be "mysql"
# as the later files take precedence.
"[sqlfluff]\ndialect=bigquery\n",
"[sqlfluff]\ndialect=mysql\n",
)
# 3d. FluffConfig objects can be created from a path containing a config file.
config = FluffConfig.from_path("test/fixtures/")
# 3e. FluffConfig objects can be created from keyword arguments.
config = FluffConfig.from_kwargs(dialect="bigquery", rules=["LT01"])
# The FluffConfig is then provided via a config argument.
sqlfluff.fix("SELECT 1", config=config)
# #######################################
# The core API is always configured using a FluffConfig object.
# When instantiating a Linter (or Parser), a FluffConfig must be provided.
# See above for details on how to create a FluffConfig.
linter = Linter(config=config)
# The provided config will then be used in any operations.
lint_result = linter.lint_string("SELECT 1", fix=True)
fixed_string = lint_result.fix_string()
# NOTE: The "True" element shows that fixing was a success.
assert fixed_string == ("SELECT 1", True)
Supported dialects and rules are available through list_dialects() and list_rules().
"""This is an example of how get basic options from sqlfluff."""
import sqlfluff
# -------- DIALECTS ----------
dialects = sqlfluff.list_dialects()
# dialects = [DialectTuple(label='ansi', name='ansi', inherits_from='nothing'), ...]
dialect_names = [dialect.label for dialect in dialects]
# dialect_names = ["ansi", "snowflake", ...]
# -------- RULES ----------
rules = sqlfluff.list_rules()
# rules = [
# RuleTuple(
# code='Example_LT01',
# description='ORDER BY on these columns is forbidden!'
# ),
# ...
# ]
rule_codes = [rule.code for rule in rules]
# rule_codes = ["LT01", "LT02", ...]
Advanced API reference¶
The core elements of sqlfluff.
- class FluffConfig(configs: ConfigMappingType | None = None, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None, require_dialect: bool = True)¶
The class that actually gets passed around as a config object.
- diff_to(other: FluffConfig) Dict[str, Any] ¶
Compare this config to another.
- Parameters:
  - other (FluffConfig) – Another config object to compare against. We will return keys from this object that are not in other or are different to those in other.
- Returns:
  A filtered dict of items in this config that are not in the other or are different to the other.
- classmethod from_kwargs(config: FluffConfig | None = None, dialect: str | None = None, rules: List[str] | None = None, exclude_rules: List[str] | None = None, require_dialect: bool = True) FluffConfig ¶
Instantiate a config from either an existing config or kwargs.
This is a convenience method for the ways that the public classes like Linter(), Parser() and Lexer() can be instantiated with a FluffConfig or with the convenience kwargs: dialect & rules.
- classmethod from_path(path: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None) FluffConfig ¶
Loads a config object given a particular path.
- classmethod from_root(extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, **kw: Any) FluffConfig ¶
Loads a config object just based on the root directory.
- classmethod from_string(config_string: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig ¶
Loads a config object from a single config string.
- classmethod from_strings(*config_strings: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig ¶
Loads a config object given a series of nested config strings.
Config strings are incorporated from first to last, treating the first element as the “root” config, and then later config strings will take precedence over any earlier values.
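A brief sketch of that precedence, assuming the resolved dialect name can then be read back with get():
from sqlfluff.core import FluffConfig

config = FluffConfig.from_strings(
    "[sqlfluff]\ndialect=bigquery\n",
    "[sqlfluff]\ndialect=mysql\n",
)
# The later string takes precedence for the overlapping key.
print(config.get("dialect"))  # mysql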
- get(val: str, section: str | Iterable[str] = 'core', default: Any = None) Any ¶
Get a particular value from the config.
- get_section(section: str | Iterable[str]) Any ¶
Return a whole section of config as a dict.
If the element found at the address is a value and not a section, it is still returned and so this can be used as a more advanced form of the basic get method.
- Parameters:
section – An iterable or string. If it’s a string we load that root section. If it’s an iterable of strings, then we treat it as a path within the dictionary structure.
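For illustration, a small sketch contrasting get() with get_section(); it assumes the standard core section of the default configuration:
from sqlfluff.core import FluffConfig

config = FluffConfig(overrides={"dialect": "ansi"})
# Fetch a single value from the default "core" section...
print(config.get("dialect"))  # ansi
# ...or pull back the whole section as a dict and index into it.
core_section = config.get_section("core")
print(core_section["dialect"])  # ansi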
- get_templater(templater_name: str = 'jinja', **kwargs: Any) RawTemplater ¶
Fetch a templater by name.
- iter_vals(cfg: Dict[str, Any] | None = None) Iterable[Tuple[Any, ...]] ¶
Return an iterable of tuples representing keys.
We show values before dicts; the tuple contains an indent value to know what level of the dict we’re in. Dict labels will be returned as a blank value before their content.
- make_child_from_path(path: str) FluffConfig ¶
Make a child config at a path but pass on overrides and extra_config_path.
- process_inline_config(config_line: str, fname: str) None ¶
Process an inline config command and update self.
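A hedged sketch, assuming the standard `-- sqlfluff:key:value` inline comment form and that the updated value is then readable via get():
from sqlfluff.core import FluffConfig

config = FluffConfig(overrides={"dialect": "ansi"})
# Inline config comments take the form "-- sqlfluff:<key>:<value>".
config.process_inline_config("-- sqlfluff:dialect:bigquery", fname="<string>")
print(config.get("dialect"))  # bigquery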
- process_raw_file_for_config(raw_str: str, fname: str) None ¶
Process a full raw file for inline config and update self.
- set_value(config_path: Iterable[str], val: Any) None ¶
Set a value at a given path.
- verify_dialect_specified() None ¶
Check if the config specifies a dialect, raising an error if not.
- class Lexer(config: FluffConfig | None = None, last_resort_lexer: StringLexer | None = None, dialect: str | None = None)¶
The Lexer class actually does the lexing step.
- elements_to_segments(elements: List[TemplateElement], templated_file: TemplatedFile) Tuple[RawSegment, ...] ¶
Convert a tuple of lexed elements into a tuple of segments.
- lex(raw: str | TemplatedFile) Tuple[Tuple[BaseSegment, ...], List[SQLLexError]] ¶
Take a string or TemplatedFile and return segments.
If we fail to match the whole string, then we must have found something that we cannot lex. If that happens we should package it up as unlexable and keep track of the exceptions.
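For illustration, a minimal sketch of lexing a string directly and inspecting the result:
from sqlfluff.core import Lexer

tokens, lex_violations = Lexer(dialect="ansi").lex("SELECT 1\n")
# tokens is a tuple of raw segments; lex_violations lists anything unlexable.
print([token.raw for token in tokens])  # e.g. ['SELECT', ' ', '1', '\n', ...]
print(lex_violations)  # []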
- static lex_match(forward_string: str, lexer_matchers: List[StringLexer]) LexMatch ¶
Iteratively match strings using the selection of submatchers.
- static map_template_slices(elements: List[LexedElement], template: TemplatedFile) List[TemplateElement] ¶
Create a tuple of TemplateElement from a tuple of LexedElement.
This adds slices in the templated file to the original lexed elements. We’ll need this to work out the position in the source file.
- static violations_from_segments(segments: Tuple[RawSegment, ...]) List[SQLLexError] ¶
Generate any lexing errors for any unlexables.
- class Linter(config: FluffConfig | None = None, formatter: Any = None, dialect: str | None = None, rules: List[str] | None = None, user_rules: List[Type[BaseRule]] | None = None, exclude_rules: List[str] | None = None)¶
The interface class to interact with the linter.
- classmethod allowed_rule_ref_map(reference_map: Dict[str, Set[str]], disable_noqa_except: str | None) Dict[str, Set[str]] ¶
Generate a noqa rule reference map.
- fix(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) Tuple[BaseSegment, List[SQLBaseError]] ¶
Return the fixed tree and violations from lintfix when we’re fixing.
- get_rulepack(config: FluffConfig | None = None) RulePack ¶
Get hold of a set of rules.
- lint(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) List[SQLBaseError] ¶
Return just the violations from lintfix when we’re only linting.
- classmethod lint_fix_parsed(tree: BaseSegment, config: FluffConfig, rule_pack: RulePack, fix: bool = False, fname: str | None = None, templated_file: TemplatedFile | None = None, formatter: Any = None) Tuple[BaseSegment, List[SQLBaseError], IgnoreMask | None, List[Tuple[str, str, float]]] ¶
Lint and optionally fix a tree object.
- classmethod lint_parsed(parsed: ParsedString, rule_pack: RulePack, fix: bool = False, formatter: Any = None, encoding: str = 'utf8') LintedFile ¶
Lint a ParsedString and return a LintedFile.
- lint_path(path: str, fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None) LintedDir ¶
Lint a path.
- lint_paths(paths: Tuple[str, ...], fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None, apply_fixes: bool = False, fixed_file_suffix: str = '', fix_even_unparsable: bool = False, retain_files: bool = True) LintingResult ¶
Lint an iterable of paths.
- classmethod lint_rendered(rendered: RenderedFile, rule_pack: RulePack, fix: bool = False, formatter: Any = None) LintedFile ¶
Take a RenderedFile and return a LintedFile.
- lint_string(in_str: str = '', fname: str = '<string input>', fix: bool = False, config: FluffConfig | None = None, encoding: str = 'utf8') LintedFile ¶
Lint a string.
- Returns:
an object representing that linted file.
- Return type:
LintedFile
- lint_string_wrapped(string: str, fname: str = '<string input>', fix: bool = False) LintingResult ¶
Lint strings directly.
- static load_raw_file_and_config(fname: str, root_config: FluffConfig) Tuple[str, FluffConfig, str] ¶
Load a raw file and the associated config.
- parse_path(path: str, parse_statistics: bool = False) Iterator[ParsedString] ¶
Parse a path of sql files.
NB: This is a generator which will yield the result of each file within the path iteratively.
- classmethod parse_rendered(rendered: RenderedFile, parse_statistics: bool = False) ParsedString ¶
Parse a rendered file.
- parse_string(in_str: str, fname: str = '<string>', config: FluffConfig | None = None, encoding: str = 'utf-8', parse_statistics: bool = False) ParsedString ¶
Parse a string.
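A small sketch of parsing a string through the Linter; it assumes the returned ParsedString exposes the parsed tree as a `.tree` attribute:
from sqlfluff.core import Linter

parsed = Linter(dialect="ansi").parse_string("SELECT 1\n")
# NOTE: `.tree` is assumed here to hold the root segment of the parse tree.
if parsed.tree:
    print(parsed.tree.stringify())  # A readable dump of the parse tree.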
- static remove_templated_errors(linting_errors: List[SQLBaseError]) List[SQLBaseError] ¶
Filter a list of lint errors, removing those from the templated slices.
- render_file(fname: str, root_config: FluffConfig) RenderedFile ¶
Load and render a file with relevant config.
- render_string(in_str: str, fname: str, config: FluffConfig, encoding: str) RenderedFile ¶
Template the file.
- rule_tuples() List[RuleTuple] ¶
A simple pass through to access the rule tuples of the rule set.
- class Parser(config: FluffConfig | None = None, dialect: str | None = None)¶
Instantiates parsed queries from a sequence of lexed raw segments.
- parse(segments: Sequence[BaseSegment], fname: str | None = None, parse_statistics: bool = False) BaseSegment | None ¶
Parse a series of lexed tokens using the current dialect.
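Putting the low-level pieces together, a hedged sketch of the lex-then-parse chain that the higher-level Linter wraps:
from sqlfluff.core import Lexer, Parser

tokens, _ = Lexer(dialect="ansi").lex("SELECT 1\n")
parsed = Parser(dialect="ansi").parse(tokens)
if parsed:
    # The root segment reconstructs the original source via .raw.
    print(parsed.raw)  # SELECT 1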