Summary
motivation:
- Data exploration for NDJSON and CSV
- NDJSON matched by name cannot treat a whole line as a single variant
example
select $1,$2,$10 from @~/1.csv.gz;
----
$1 $2 $10
100 small <null>
200 lamb co. <null>
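The positional projection in the example above can be sketched in Python. This is only an illustration of the intended semantics, not the actual implementation: `project_by_position` is a hypothetical helper, and the two-column sample text is an assumed stand-in for the contents of `@~/1.csv.gz`. A position past the end of a row yields `None`, which corresponds to `<null>` in the output above.

```python
import csv
import io


def project_by_position(csv_text, positions):
    """Yield one tuple per CSV row holding the requested 1-based columns.

    A position beyond the end of a row yields None (shown as <null> above).
    """
    for row in csv.reader(io.StringIO(csv_text)):
        yield tuple(row[p - 1] if p <= len(row) else None for p in positions)


# Assumed file contents: two columns only, so $10 is always missing.
sample = "100,small\n200,lamb co.\n"
rows = list(project_by_position(sample, [1, 2, 10]))
```

All values stay strings here, matching the "decode (strings)" step that the select path uses for CSV.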
whole picture
| | copy | select (transform may add a cast after projection) |
|---|---|---|
| parquet [name] (default) or [pos] | 1. load 2. reorder columns if by name 3. cast (implementation based on select) | 1. load |
| csv [pos] | 1. decode (dst schema) | 1. decode (strings), tolerant of bad CSV |
| ndjson [name] (default) | 1. decode (dst schema) | 1. decode (variants) |
| ndjson [pos] | 1. decode (1 variant) 2. cast | 1. decode (1 variant), only $1 allowed |
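To make the two NDJSON rows of the table concrete, here is a minimal Python sketch of the difference between decoding by name and decoding by position. Both function names are hypothetical, and treating a missing key as `None` is an assumption rather than something the table specifies.

```python
import json


def decode_ndjson_by_name(text, columns):
    """By name: each requested top-level key becomes one column."""
    for line in text.splitlines():
        if line.strip():
            obj = json.loads(line)
            # Assumption: a key missing from the line decodes to None.
            yield tuple(obj.get(c) for c in columns)


def decode_ndjson_by_pos(text, positions):
    """By position: the whole line is one variant, so only $1 is valid."""
    if any(p != 1 for p in positions):
        raise ValueError("ndjson by position only allows $1")
    for line in text.splitlines():
        if line.strip():
            variant = json.loads(line)
            yield tuple(variant for _ in positions)


sample = '{"id": 1, "age": 30}\n{"id": 2}\n'
by_name = list(decode_ndjson_by_name(sample, ["id", "age"]))
by_pos = list(decode_ndjson_by_pos(sample, [1]))
```

The by-position path keeps the full object per line, which is what makes schema-free exploration of NDJSON possible.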
MATCH_COLUMN_BY_NAME
MATCH_COLUMN_BY_NAME = CASE_SENSITIVE | CASE_INSENSITIVE | NONE
NONE means by pos
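The three modes could map source columns to destination columns roughly as follows. This is a sketch under assumptions, not the proposed implementation: `match_columns` is a hypothetical name, an unmatched destination column maps to `None`, and the first occurrence wins when two source names collide after case folding.

```python
def match_columns(source_cols, dest_cols, mode):
    """Return, for each destination column, the source index to read from.

    mode is "CASE_SENSITIVE", "CASE_INSENSITIVE", or "NONE" (by position).
    A destination column without a match maps to None (an assumption).
    """
    if mode == "NONE":
        # By position: destination column i reads source column i.
        return [i if i < len(source_cols) else None
                for i in range(len(dest_cols))]
    if mode == "CASE_SENSITIVE":
        key = lambda s: s
    elif mode == "CASE_INSENSITIVE":
        key = str.lower
    else:
        raise ValueError(f"unknown mode: {mode}")
    index = {}
    for i, name in enumerate(source_cols):
        index.setdefault(key(name), i)  # first occurrence wins (assumption)
    return [index.get(key(d)) for d in dest_cols]
```

For example, with source columns `["ID", "age"]` and destination columns `["id", "age"]`, only `age` matches under CASE_SENSITIVE, while both match under CASE_INSENSITIVE.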
default value
- copy (without transform): based on file type:
  - Parquet/NDJSON/XML: CASE_SENSITIVE
  - TSV/CSV: NONE
- Select: based on select type:
  - `select $1, $2 ...`: NONE
  - `select id, age, ...`: CASE_SENSITIVE
    - use settings:
      - unquoted_ident_case_sensitive: default 0
      - quoted_ident_case_sensitive: default 1
  - Select *
    - Only Parquet is supported
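The default rules listed above can be summarized in a small Python sketch. The function names are hypothetical, the handling of a select list that mixes `$n` with named columns is an assumption (it falls back to CASE_SENSITIVE), and the two identifier case-sensitivity settings are not modeled.

```python
import re

BY_NAME_FORMATS = {"PARQUET", "NDJSON", "XML"}
BY_POS_FORMATS = {"TSV", "CSV"}
POSITIONAL = re.compile(r"\$\d+")


def default_for_copy(file_format):
    """Default MATCH_COLUMN_BY_NAME for COPY without a transform."""
    fmt = file_format.upper()
    if fmt in BY_NAME_FORMATS:
        return "CASE_SENSITIVE"
    if fmt in BY_POS_FORMATS:
        return "NONE"
    raise ValueError(f"no default listed for format: {file_format}")


def default_for_select(select_items):
    """Default for SELECT, given the projected expressions as strings."""
    if all(POSITIONAL.fullmatch(item.strip()) for item in select_items):
        return "NONE"  # select $1, $2 ...
    return "CASE_SENSITIVE"  # select id, age, ...
```

Keeping the two paths separate mirrors the list above: COPY decides from the file type, while SELECT decides from the shape of the projection.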
Tasks
- support $1, $2
- add MATCH_COLUMN_BY_NAME

by POS
- select CSV
- Copy/select NDJSON
- Copy/select Parquet (optional)

by NAME
- Select JSON
Activity
Title changed from "Feature: select/load_with_tranform for CSV/JS" to "Feature: copy/select from stage by pos $<col_position>"
#11585