json-and-csv-data-transformation
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseJSON and CSV Data Transformation
JSON与CSV数据转换
Transform data between JSON, CSV, and other formats. Filter, map, flatten nested objects, and reshape data for analysis, reporting, and API integration.
在JSON、CSV和其他格式之间转换数据,支持过滤、映射、扁平化嵌套对象,可重塑数据结构以满足分析、报表和API集成需求。
When to use
适用场景
- Use case 1: When the user asks to convert data between JSON and CSV formats
- Use case 2: When you need to filter, extract, or transform specific fields from data
- Use case 3: For flattening nested JSON structures into tabular format
- Use case 4: When processing API responses for analysis or reporting
- 场景1:当用户需要在JSON和CSV格式之间转换数据时
- 场景2:当你需要过滤、提取或转换数据中的特定字段时
- 场景3:将嵌套的JSON结构扁平化为表格格式
- 场景4:处理API响应用于分析或生成报表时
Required tools / APIs
所需工具/API
- jq — Command-line JSON processor (essential for JSON manipulation)
- csvkit — Suite of CSV tools (csvjson, csvcut, csvgrep, etc.)
- No external API required
Install options:
bash
undefined- jq — 命令行JSON处理器(JSON操作的核心工具)
- csvkit — CSV工具套件(包含csvjson、csvcut、csvgrep等工具)
- 无需调用外部API
安装选项:
bash
undefinedUbuntu/Debian
Ubuntu/Debian
sudo apt-get install -y jq csvkit
sudo apt-get install -y jq csvkit
macOS
macOS
brew install jq csvkit
brew install jq csvkit
Node.js (native support, no packages needed for basic operations)
Node.js(原生支持,基础操作无需安装额外包)
For advanced CSV parsing: npm install csv-parse csv-stringify
高级CSV解析需要安装:npm install csv-parse csv-stringify
undefinedundefinedSkills
功能技能
json_to_csv
json_to_csv
Convert JSON array to CSV format.
bash
undefined将JSON数组转换为CSV格式。
bash
undefinedSimple JSON array to CSV
简单JSON数组转CSV
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' | jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)'
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' | jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)'
JSON file to CSV file
JSON文件转CSV文件
jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)' data.json > output.csv
jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)' data.json > output.csv
JSON to CSV with specific fields
JSON转CSV并指定输出字段
jq -r '.[] | [.id, .name, .email] | @csv' users.json
jq -r '.[] | [.id, .name, .email] | @csv' users.json
Using csvkit (simpler syntax)
使用csvkit实现(语法更简单)
cat data.json | in2csv -f json > output.csv
**Node.js:**
```javascript
function jsonToCSV(jsonArray) {
if (!Array.isArray(jsonArray) || jsonArray.length === 0) {
return '';
}
// Get headers from first object
const headers = Object.keys(jsonArray[0]);
// Escape CSV values
const escape = (val) => {
if (val === null || val === undefined) return '';
const str = String(val);
if (str.includes(',') || str.includes('"') || str.includes('\n')) {
return `"${str.replace(/"/g, '""')}"`;
}
return str;
};
// Build CSV
const headerRow = headers.join(',');
const dataRows = jsonArray.map(obj =>
headers.map(header => escape(obj[header])).join(',')
);
return [headerRow, ...dataRows].join('\n');
}
// Usage
// const data = [
// { name: 'Alice', age: 30, city: 'New York' },
// { name: 'Bob', age: 25, city: 'San Francisco' }
// ];
// console.log(jsonToCSV(data));cat data.json | in2csv -f json > output.csv
**Node.js实现:**
```javascript
function jsonToCSV(jsonArray) {
if (!Array.isArray(jsonArray) || jsonArray.length === 0) {
return '';
}
// 从第一个对象获取表头
const headers = Object.keys(jsonArray[0]);
// 转义CSV特殊值
const escape = (val) => {
if (val === null || val === undefined) return '';
const str = String(val);
if (str.includes(',') || str.includes('"') || str.includes('\n')) {
return `"${str.replace(/"/g, '""')}"`;
}
return str;
};
// 构建CSV内容
const headerRow = headers.join(',');
const dataRows = jsonArray.map(obj =>
headers.map(header => escape(obj[header])).join(',')
);
return [headerRow, ...dataRows].join('\n');
}
// 使用示例
// const data = [
// { name: 'Alice', age: 30, city: 'New York' },
// { name: 'Bob', age: 25, city: 'San Francisco' }
// ];
// console.log(jsonToCSV(data));csv_to_json
csv_to_json
Convert CSV to JSON array.
bash
undefined将CSV转换为JSON数组。
bash
undefinedCSV to JSON
CSV转JSON
csvjson data.csv
csvjson data.csv
CSV to JSON with pretty printing
CSV转JSON并格式化输出
csvjson data.csv | jq '.'
csvjson data.csv | jq '.'
CSV to JSON array of objects
CSV转JSON对象数组
csvjson --stream data.csv
csvjson --stream data.csv
CSV file to JSON file
CSV文件转JSON文件
csvjson input.csv > output.json
csvjson input.csv > output.json
Using pure jq (if headers are in first row)
纯jq实现(假设首行是表头)
jq -Rsn '[inputs | split(",") | {name: .[0], age: .[1], city: .[2]}]' < data.csv
**Node.js:**
```javascript
function csvToJSON(csvString) {
const lines = csvString.trim().split('\n');
if (lines.length < 2) return [];
// Parse CSV value (handle quotes)
const parseCSVValue = (val) => {
val = val.trim();
if (val.startsWith('"') && val.endsWith('"')) {
return val.slice(1, -1).replace(/""/g, '"');
}
return val;
};
// Split CSV line (basic implementation)
const splitCSVLine = (line) => {
const result = [];
let current = '';
let inQuotes = false;
for (let i = 0; i < line.length; i++) {
const char = line[i];
if (char === '"') {
inQuotes = !inQuotes;
current += char;
} else if (char === ',' && !inQuotes) {
result.push(parseCSVValue(current));
current = '';
} else {
current += char;
}
}
result.push(parseCSVValue(current));
return result;
};
const headers = splitCSVLine(lines[0]);
const data = lines.slice(1).map(line => {
const values = splitCSVLine(line);
const obj = {};
headers.forEach((header, i) => {
obj[header] = values[i] || '';
});
return obj;
});
return data;
}
// Usage
// const csv = `name,age,city
// Alice,30,New York
// Bob,25,"San Francisco"`;
// console.log(JSON.stringify(csvToJSON(csv), null, 2));jq -Rsn '[inputs | split(",") | {name: .[0], age: .[1], city: .[2]}]' < data.csv
**Node.js实现:**
```javascript
function csvToJSON(csvString) {
const lines = csvString.trim().split('\n');
if (lines.length < 2) return [];
// 解析CSV值(处理引号)
const parseCSVValue = (val) => {
val = val.trim();
if (val.startsWith('"') && val.endsWith('"')) {
return val.slice(1, -1).replace(/""/g, '"');
}
return val;
};
// 拆分CSV行(基础实现)
const splitCSVLine = (line) => {
const result = [];
let current = '';
let inQuotes = false;
for (let i = 0; i < line.length; i++) {
const char = line[i];
if (char === '"') {
inQuotes = !inQuotes;
current += char;
} else if (char === ',' && !inQuotes) {
result.push(parseCSVValue(current));
current = '';
} else {
current += char;
}
}
result.push(parseCSVValue(current));
return result;
};
const headers = splitCSVLine(lines[0]);
const data = lines.slice(1).map(line => {
const values = splitCSVLine(line);
const obj = {};
headers.forEach((header, i) => {
obj[header] = values[i] || '';
});
return obj;
});
return data;
}
// 使用示例
// const csv = `name,age,city
// Alice,30,New York
// Bob,25,"San Francisco"`;
// console.log(JSON.stringify(csvToJSON(csv), null, 2));filter_and_extract_json
filter_and_extract_json
Filter and extract specific fields from JSON.
bash
undefined过滤并提取JSON中的特定字段。
bash
undefinedExtract specific fields
提取特定字段
jq '.[] | {name: .name, email: .email}' users.json
jq '.[] | {name: .name, email: .email}' users.json
Filter by condition
按条件过滤
jq '.[] | select(.age > 25)' users.json
jq '.[] | select(.age > 25)' users.json
Filter and extract
过滤并提取
jq '[.[] | select(.active == true) | {id: .id, name: .name}]' data.json
jq '[.[] | select(.active == true) | {id: .id, name: .name}]' data.json
Extract nested fields
提取嵌套字段
jq '.[] | {name: .name, street: .address.street, city: .address.city}' data.json
jq '.[] | {name: .name, street: .address.street, city: .address.city}' data.json
Get array of single field
获取单个字段组成的数组
jq '.[].name' users.json
jq '.[].name' users.json
Filter with multiple conditions
多条件过滤
jq '.[] | select(.age > 20 and .country == "USA")' users.json
jq '.[] | select(.age > 20 and .country == "USA")' users.json
Map and transform values
映射并转换值
jq '.[] | .price = (.price * 1.1)' products.json
**Node.js:**
```javascript
function filterAndExtractJSON(data, options) {
const { filter, extract } = options;
let result = Array.isArray(data) ? data : [data];
// Apply filter function
if (filter) {
result = result.filter(filter);
}
// Extract specific fields
if (extract) {
result = result.map(item => {
const extracted = {};
extract.forEach(field => {
// Support nested fields with dot notation
const value = field.split('.').reduce((obj, key) => obj?.[key], item);
extracted[field] = value;
});
return extracted;
});
}
return result;
}
// Usage
// const users = [
// { id: 1, name: 'Alice', age: 30, address: { city: 'NYC' } },
// { id: 2, name: 'Bob', age: 25, address: { city: 'SF' } },
// { id: 3, name: 'Charlie', age: 35, address: { city: 'LA' } }
// ];
//
// const result = filterAndExtractJSON(users, {
// filter: user => user.age > 25,
// extract: ['name', 'age', 'address.city']
// });
// console.log(result);jq '.[] | .price = (.price * 1.1)' products.json
**Node.js实现:**
```javascript
function filterAndExtractJSON(data, options) {
const { filter, extract } = options;
let result = Array.isArray(data) ? data : [data];
// 应用过滤函数
if (filter) {
result = result.filter(filter);
}
// 提取特定字段
if (extract) {
result = result.map(item => {
const extracted = {};
extract.forEach(field => {
// 支持点分隔的嵌套字段
const value = field.split('.').reduce((obj, key) => obj?.[key], item);
extracted[field] = value;
});
return extracted;
});
}
return result;
}
// 使用示例
// const users = [
// { id: 1, name: 'Alice', age: 30, address: { city: 'NYC' } },
// { id: 2, name: 'Bob', age: 25, address: { city: 'SF' } },
// { id: 3, name: 'Charlie', age: 35, address: { city: 'LA' } }
// ];
//
// const result = filterAndExtractJSON(users, {
// filter: user => user.age > 25,
// extract: ['name', 'age', 'address.city']
// });
// console.log(result);flatten_nested_json
flatten_nested_json
Flatten nested JSON objects into flat structure.
bash
undefined将嵌套的JSON对象扁平化为单层结构。
bash
undefinedFlatten nested JSON with jq
使用jq扁平化嵌套JSON
jq '[.[] | {id: .id, name: .name, street: .address.street, city: .address.city, zip: .address.zip}]' users.json
jq '[.[] | {id: .id, name: .name, street: .address.street, city: .address.city, zip: .address.zip}]' users.json
Flatten all nested fields with custom separator
使用自定义分隔符扁平化所有嵌套字段
jq '[.[] | to_entries | map({key: .key, value: .value}) | from_entries]' data.json
jq '[.[] | to_entries | map({key: .key, value: .value}) | from_entries]' data.json
Flatten deeply nested structure
扁平化深度嵌套结构
jq 'recurse | select(type != "object" and type != "array")' complex.json
**Node.js:**
```javascript
function flattenJSON(obj, prefix = '', separator = '.') {
const flattened = {};
for (const key in obj) {
const value = obj[key];
const newKey = prefix ? `${prefix}${separator}${key}` : key;
if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
// Recursively flatten nested objects
Object.assign(flattened, flattenJSON(value, newKey, separator));
} else if (Array.isArray(value)) {
// Convert arrays to string or flatten each item
flattened[newKey] = JSON.stringify(value);
} else {
flattened[newKey] = value;
}
}
return flattened;
}
// Usage
// const nested = {
// id: 1,
// name: 'Alice',
// address: {
// street: '123 Main St',
// city: 'NYC',
// coordinates: { lat: 40.7, lon: -74.0 }
// },
// tags: ['user', 'active']
// };
// console.log(flattenJSON(nested));
// Output: {
// id: 1,
// name: 'Alice',
// 'address.street': '123 Main St',
// 'address.city': 'NYC',
// 'address.coordinates.lat': 40.7,
// 'address.coordinates.lon': -74.0,
// tags: '["user","active"]'
// }jq 'recurse | select(type != "object" and type != "array")' complex.json
**Node.js实现:**
```javascript
function flattenJSON(obj, prefix = '', separator = '.') {
const flattened = {};
for (const key in obj) {
const value = obj[key];
const newKey = prefix ? `${prefix}${separator}${key}` : key;
if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
// 递归扁平化嵌套对象
Object.assign(flattened, flattenJSON(value, newKey, separator));
} else if (Array.isArray(value)) {
// 将数组转为字符串或扁平化每个元素
flattened[newKey] = JSON.stringify(value);
} else {
flattened[newKey] = value;
}
}
return flattened;
}
// 使用示例
// const nested = {
// id: 1,
// name: 'Alice',
// address: {
// street: '123 Main St',
// city: 'NYC',
// coordinates: { lat: 40.7, lon: -74.0 }
// },
// tags: ['user', 'active']
// };
// console.log(flattenJSON(nested));
// 输出: {
// id: 1,
// name: 'Alice',
// 'address.street': '123 Main St',
// 'address.city': 'NYC',
// 'address.coordinates.lat': 40.7,
// 'address.coordinates.lon': -74.0,
// tags: '["user","active"]'
// }transform_csv_data
transform_csv_data
Transform and manipulate CSV data.
bash
undefined转换和操作CSV数据。
bash
undefinedSelect specific columns
选择指定列
csvcut -c name,email,age users.csv
csvcut -c name,email,age users.csv
Filter rows by value
按值过滤行
csvgrep -c age -r "^[3-9][0-9]$" users.csv # age >= 30
csvgrep -c age -r "^[3-9][0-9]$" users.csv # 年龄 >= 30
Sort CSV
排序CSV
csvsort -c age -r users.csv # reverse sort by age
csvsort -c age -r users.csv # 按年龄倒序排序
Remove duplicate rows
去重行
csvcut -c name,email users.csv | uniq
csvcut -c name,email users.csv | uniq
Combine: filter, select columns, sort
组合操作:过滤、选列、排序
csvgrep -c country -m "USA" users.csv | csvcut -c name,age | csvsort -c age
csvgrep -c country -m "USA" users.csv | csvcut -c name,age | csvsort -c age
Add calculated column (requires csvpy or awk)
添加计算列(需要csvpy或awk)
awk -F',' 'BEGIN{OFS=","} NR==1{print $0,"total"} NR>1{print $0,$2*$3}' data.csv
awk -F',' 'BEGIN{OFS=","} NR==1{print $0,"total"} NR>1{print $0,$2*$3}' data.csv
Merge two CSV files by column
按列合并两个CSV文件
csvjoin -c id users.csv orders.csv
**Node.js:**
```javascript
function transformCSV(csvData, transformations) {
const { selectColumns, filterRows, sortBy } = transformations;
// Parse CSV to objects
const data = csvToJSON(csvData);
let result = data;
// Filter rows
if (filterRows) {
result = result.filter(filterRows);
}
// Select columns
if (selectColumns) {
result = result.map(row => {
const selected = {};
selectColumns.forEach(col => {
selected[col] = row[col];
});
return selected;
});
}
// Sort
if (sortBy) {
const { column, reverse } = sortBy;
result.sort((a, b) => {
const aVal = a[column];
const bVal = b[column];
const comparison = aVal > bVal ? 1 : aVal < bVal ? -1 : 0;
return reverse ? -comparison : comparison;
});
}
// Convert back to CSV
return jsonToCSV(result);
}
// Usage
// const csv = `name,age,country
// Alice,30,USA
// Bob,25,Canada
// Charlie,35,USA`;
//
// const transformed = transformCSV(csv, {
// filterRows: row => row.country === 'USA',
// selectColumns: ['name', 'age'],
// sortBy: { column: 'age', reverse: true }
// });
// console.log(transformed);csvjoin -c id users.csv orders.csv
**Node.js实现:**
```javascript
function transformCSV(csvData, transformations) {
const { selectColumns, filterRows, sortBy } = transformations;
// 将CSV解析为对象
const data = csvToJSON(csvData);
let result = data;
// 过滤行
if (filterRows) {
result = result.filter(filterRows);
}
// 选择列
if (selectColumns) {
result = result.map(row => {
const selected = {};
selectColumns.forEach(col => {
selected[col] = row[col];
});
return selected;
});
}
// 排序
if (sortBy) {
const { column, reverse } = sortBy;
result.sort((a, b) => {
const aVal = a[column];
const bVal = b[column];
const comparison = aVal > bVal ? 1 : aVal < bVal ? -1 : 0;
return reverse ? -comparison : comparison;
});
}
// 转换回CSV格式
return jsonToCSV(result);
}
// 使用示例
// const csv = `name,age,country
// Alice,30,USA
// Bob,25,Canada
// Charlie,35,USA`;
//
// const transformed = transformCSV(csv, {
// filterRows: row => row.country === 'USA',
// selectColumns: ['name', 'age'],
// sortBy: { column: 'age', reverse: true }
// });
// console.log(transformed);aggregate_and_group_json
aggregate_and_group_json
Aggregate and group JSON data (similar to SQL GROUP BY).
bash
undefined聚合和分组JSON数据(类似SQL的GROUP BY功能)。
bash
undefinedGroup by field and count
按字段分组并计数
jq 'group_by(.country) | map({country: .[0].country, count: length})' users.json
jq 'group_by(.country) | map({country: .[0].country, count: length})' users.json
Sum values by group
按分组求和
jq 'group_by(.category) | map({category: .[0].category, total: map(.price) | add})' products.json
jq 'group_by(.category) | map({category: .[0].category, total: map(.price) | add})' products.json
Average by group
按分组求平均值
jq 'group_by(.department) | map({department: .[0].department, avg_salary: (map(.salary) | add / length)})' employees.json
jq 'group_by(.department) | map({department: .[0].department, avg_salary: (map(.salary) | add / length)})' employees.json
Multiple aggregations
多维度聚合
jq 'group_by(.region) | map({
region: .[0].region,
count: length,
total_sales: map(.sales) | add,
avg_sales: (map(.sales) | add / length)
})' sales.json
**Node.js:**
```javascript
function groupAndAggregate(data, groupBy, aggregations) {
// Group data
const grouped = {};
data.forEach(item => {
const key = item[groupBy];
if (!grouped[key]) grouped[key] = [];
grouped[key].push(item);
});
// Apply aggregations
return Object.entries(grouped).map(([key, items]) => {
const result = { [groupBy]: key };
aggregations.forEach(agg => {
if (agg.type === 'count') {
result[agg.name] = items.length;
} else if (agg.type === 'sum') {
result[agg.name] = items.reduce((sum, item) => sum + (item[agg.field] || 0), 0);
} else if (agg.type === 'avg') {
const sum = items.reduce((s, item) => s + (item[agg.field] || 0), 0);
result[agg.name] = items.length > 0 ? sum / items.length : 0;
} else if (agg.type === 'min') {
result[agg.name] = Math.min(...items.map(item => item[agg.field] || Infinity));
} else if (agg.type === 'max') {
result[agg.name] = Math.max(...items.map(item => item[agg.field] || -Infinity));
}
});
return result;
});
}
// Usage
// const sales = [
// { region: 'East', product: 'A', amount: 100 },
// { region: 'East', product: 'B', amount: 200 },
// { region: 'West', product: 'A', amount: 150 },
// { region: 'West', product: 'B', amount: 250 }
// ];
//
// const result = groupAndAggregate(sales, 'region', [
// { name: 'count', type: 'count' },
// { name: 'total_amount', type: 'sum', field: 'amount' },
// { name: 'avg_amount', type: 'avg', field: 'amount' }
// ]);
// console.log(result);jq 'group_by(.region) | map({
region: .[0].region,
count: length,
total_sales: map(.sales) | add,
avg_sales: (map(.sales) | add / length)
})' sales.json
**Node.js实现:**
```javascript
function groupAndAggregate(data, groupBy, aggregations) {
// 分组数据
const grouped = {};
data.forEach(item => {
const key = item[groupBy];
if (!grouped[key]) grouped[key] = [];
grouped[key].push(item);
});
// 应用聚合规则
return Object.entries(grouped).map(([key, items]) => {
const result = { [groupBy]: key };
aggregations.forEach(agg => {
if (agg.type === 'count') {
result[agg.name] = items.length;
} else if (agg.type === 'sum') {
result[agg.name] = items.reduce((sum, item) => sum + (item[agg.field] || 0), 0);
} else if (agg.type === 'avg') {
const sum = items.reduce((s, item) => s + (item[agg.field] || 0), 0);
result[agg.name] = items.length > 0 ? sum / items.length : 0;
} else if (agg.type === 'min') {
result[agg.name] = Math.min(...items.map(item => item[agg.field] || Infinity));
} else if (agg.type === 'max') {
result[agg.name] = Math.max(...items.map(item => item[agg.field] || -Infinity));
}
});
return result;
});
}
// 使用示例
// const sales = [
// { region: 'East', product: 'A', amount: 100 },
// { region: 'East', product: 'B', amount: 200 },
// { region: 'West', product: 'A', amount: 150 },
// { region: 'West', product: 'B', amount: 250 }
// ];
//
// const result = groupAndAggregate(sales, 'region', [
// { name: 'count', type: 'count' },
// { name: 'total_amount', type: 'sum', field: 'amount' },
// { name: 'avg_amount', type: 'avg', field: 'amount' }
// ]);
// console.log(result);Rate limits / Best practices
速率限制/最佳实践
- ✅ Stream large files — Use jq with flag and process line by line for large datasets
-c - ✅ Validate data — Check JSON/CSV format before transformation
- ✅ Handle missing fields — Use default values for null/undefined fields
- ✅ Memory management — For files >100MB, use streaming parsers
- ✅ Type conversion — Be aware of number/string conversions in CSV
- ✅ Preserve data types — JSON maintains types, CSV converts everything to strings
- ⚠️ Character encoding — Ensure UTF-8 encoding for international characters
- ⚠️ Quote escaping — Properly escape quotes in CSV values
- ✅ 大文件采用流式处理 — 处理大型数据集时,搭配参数使用jq逐行处理
-c - ✅ 数据校验 — 转换前先检查JSON/CSV格式是否合法
- ✅ 缺失字段处理 — 为空/未定义的字段设置默认值
- ✅ 内存管理 — 大于100MB的文件使用流式解析器
- ✅ 类型转换注意事项 — 注意CSV中数字/字符串的自动转换问题
- ✅ 保留数据类型 — JSON原生支持多种数据类型,CSV会将所有值转为字符串
- ⚠️ 字符编码 — 确保多语言字符使用UTF-8编码
- ⚠️ 引号转义 — CSV值中的引号需要正确转义
Agent prompt
Agent提示词
text
You have JSON and CSV data transformation capability. When a user asks to transform data:
1. Identify the input format:
- JSON: Look for {...} or [...]
- CSV: Look for comma-separated values with headers
2. For JSON to CSV:
- Use jq with @csv filter: `jq -r '... | @csv'`
- Or csvkit: `in2csv -f json`
- Node.js: Convert array of objects to CSV string
3. For CSV to JSON:
- Use csvjson from csvkit: `csvjson file.csv`
- Node.js: Parse CSV headers and data rows into objects
4. For filtering/extracting:
- Use jq select(): `jq '.[] | select(.age > 25)'`
- Use csvkit csvgrep: `csvgrep -c column -m value`
- Node.js: Use Array.filter() and map()
5. For flattening:
- Flatten nested JSON objects into dot notation
- Convert nested structures to tabular format
- Handle arrays by stringifying or creating separate rows
6. For aggregation:
- Use jq group_by(): `jq 'group_by(.field) | map({...})'`
- CSV: Convert to JSON, aggregate, convert back
- Node.js: Implement grouping and aggregation functions
Always:
- Preserve data integrity (no data loss)
- Handle edge cases (empty values, special characters)
- Validate output format matches expected structure
- For large files (>100MB), recommend streaming approachestext
你具备JSON和CSV数据转换能力。当用户要求转换数据时:
1. 识别输入格式:
- JSON:查找{...}或[...]结构
- CSV:查找带表头的逗号分隔值结构
2. JSON转CSV场景:
- 使用jq的@csv过滤器:`jq -r '... | @csv'`
- 或使用csvkit:`in2csv -f json`
- Node.js:将对象数组转换为CSV字符串
3. CSV转JSON场景:
- 使用csvkit的csvjson工具:`csvjson file.csv`
- Node.js:解析CSV表头和数据行生成对象
4. 过滤/提取场景:
- 使用jq的select()方法:`jq '.[] | select(.age > 25)'`
- 使用csvkit的csvgrep:`csvgrep -c column -m value`
- Node.js:使用Array.filter()和map()方法
5. 扁平化场景:
- 将嵌套JSON对象扁平化为点分隔的键名格式
- 将嵌套结构转为表格格式
- 数组可通过序列化或拆分多行处理
6. 聚合场景:
- 使用jq的group_by()方法:`jq 'group_by(.field) | map({...})'`
- CSV场景:先转JSON,聚合后再转回CSV
- Node.js:实现分组和聚合函数
始终遵守以下规则:
- 保证数据完整性(无数据丢失)
- 处理边界情况(空值、特殊字符)
- 校验输出格式符合预期结构
- 大于100MB的文件推荐使用流式处理方案Troubleshooting
问题排查
Error: "parse error: Invalid numeric literal"
- Symptom: jq fails to parse JSON
- Solution: Validate JSON format with , fix syntax errors
jq empty file.json
CSV columns not aligned:
- Symptom: Data appears in wrong columns after transformation
- Solution: Check for unescaped commas in data, ensure quotes are properly escaped
Empty output from jq:
- Symptom: jq returns no results
- Solution: Check filter expression syntax, verify data structure matches filter
Special characters broken in CSV:
- Symptom: Non-ASCII characters appear garbled
- Solution: Ensure UTF-8 encoding:
iconv -f UTF-8 -t UTF-8 file.csv
Memory error with large files:
- Symptom: Process runs out of memory
- Solution: Use streaming mode: or Node.js streams for line-by-line processing
jq -c
JSON doesn't convert to flat CSV:
- Symptom: Nested objects create complex CSV structure
- Solution: Flatten JSON first before converting to CSV
错误:"parse error: Invalid numeric literal"
- 症状:jq无法解析JSON
- 解决方案:使用校验JSON格式,修复语法错误
jq empty file.json
CSV列不对齐:
- 症状:转换后数据出现在错误的列中
- 解决方案:检查数据中是否存在未转义的逗号,确保引号正确转义
jq返回空输出:
- 症状:jq执行后没有返回结果
- 解决方案:检查过滤表达式语法,确认数据结构与过滤器匹配
CSV中特殊字符乱码:
- 症状:非ASCII字符显示为乱码
- 解决方案:确认使用UTF-8编码:
iconv -f UTF-8 -t UTF-8 file.csv
大文件处理内存溢出:
- 症状:进程内存不足退出
- 解决方案:使用流式模式:或Node.js流逐行处理
jq -c
JSON无法转为扁平CSV:
- 症状:嵌套对象生成的CSV结构复杂
- 解决方案:转换为CSV前先扁平化JSON结构
See also
参考链接
- ../database-query-and-export/SKILL.md — Export database results as JSON/CSV
- ../web-search-api/SKILL.md — Transform API responses to desired format
- ../using-web-scraping/SKILL.md — Process scraped data into structured formats
- ../database-query-and-export/SKILL.md — 将数据库查询结果导出为JSON/CSV格式
- ../web-search-api/SKILL.md — 将API响应转换为所需格式
- ../using-web-scraping/SKILL.md — 将爬取的数据处理为结构化格式