This C# CSV tutorial shows how to read and write CSV data in C#.
CSV
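CSV (comma-separated values) is a very popular import and export data format used in spreadsheets and databases. Each line in a CSV file is a data record. Each record consists of one or more fields, separated by commas. While CSV is a very simple data format, it comes in many variants that differ in, for example, the delimiter, the line ending, or the quote character.
In this article, we read and write CSV data with the CsvHelper library.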
$ dotnet add package CsvHelper
We need to add the CsvHelper package to our project.
C# CSV read data by records
In the following example, we read a CSV file record by record.
FirstName,LastName,Occupation
John,Doe,gardener
Roger,Roe,driver
Lucy,Smith,teacher
We have this users.csv file.
using System.Globalization;
using CsvHelper;
using CsvHelper.Configuration;
var csvConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
    HasHeaderRecord = false
};
using var streamReader = File.OpenText("users.csv");
using var csvReader = new CsvReader(streamReader, csvConfig);
string value;
while (csvReader.Read())
{
    for (int i = 0; csvReader.TryGetField<string>(i, out value); i++)
    {
        Console.Write($"{value} ");
    }
    Console.WriteLine();
}
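The Read method advances the reader to the next record. We read the fields of the record with TryGetField. Because HasHeaderRecord is set to false, the header line is treated as an ordinary record and is printed as well.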
$ dotnet run
FirstName LastName Occupation
John Doe gardener
Roger Roe driver
Lucy Smith teacher
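The example above turns the header off, so the header row is printed like any other record. As an alternative, here is a minimal sketch that reads the header with ReadHeader and then looks up fields by name. It assumes inline sample data in place of users.csv and uses CultureInfo.InvariantCulture so that the delimiter is a comma; both are our choices for illustration, not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;

// Inline sample data (an assumption) so the sketch does not depend on a file.
var data = "FirstName,LastName,Occupation\nJohn,Doe,gardener\nRoger,Roe,driver\n";

using var reader = new StringReader(data);
using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);

var names = new List<string>();

csv.Read();        // advance to the first row
csv.ReadHeader();  // treat that row as the header

while (csv.Read())
{
    // Fields are looked up by header name instead of by position.
    var first = csv.GetField("FirstName");
    var last = csv.GetField("LastName");
    names.Add($"{first} {last}");
}

Console.WriteLine(string.Join(", ", names)); // John Doe, Roger Roe
```

Looking fields up by name keeps the code working when the column order in the file changes.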
C# CSV read data into objects
In the next example, we read the data into objects with GetRecords.
using System.Globalization;
using CsvHelper;
using var streamReader = File.OpenText("users.csv");
using var csvReader = new CsvReader(streamReader, CultureInfo.CurrentCulture);
var users = csvReader.GetRecords<User>();
foreach (var user in users)
{
    Console.WriteLine(user);
}
record User(string FirstName, string LastName, string Occupation);
In the example, we define the User record and read the records from the users.csv file into instances of this type. GetRecords returns an IEnumerable of the given type.
$ dotnet run
User { FirstName = John, LastName = Doe, Occupation = gardener }
User { FirstName = Roger, LastName = Roe, Occupation = driver }
User { FirstName = Lucy, LastName = Smith, Occupation = teacher }
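C# CSV configuration
In the following example, we have a CSV file that uses a semicolon as the delimiter and contains a comment. To parse such a non-standard "CSV" file, we need to configure the parser.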
# this is users.csv file
John;Doe;gardener
Roger;Roe;driver
Lucy;Smith;teacher
We have this users.csv file.
using System.Globalization;
using CsvHelper;
using CsvHelper.Configuration;
var csvConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
    HasHeaderRecord = false,
    Comment = '#',
    AllowComments = true,
    Delimiter = ";",
};
using var streamReader = File.OpenText("users.csv");
using var csvReader = new CsvReader(streamReader, csvConfig);
while (csvReader.Read())
{
    var firstName = csvReader.GetField(0);
    var lastName = csvReader.GetField(1);
    var occupation = csvReader.GetField(2);
    Console.WriteLine($"{firstName} {lastName} is {occupation}");
}
The reader is configured with CsvConfiguration.
var csvConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
    HasHeaderRecord = false,
    Comment = '#',
    AllowComments = true,
    Delimiter = ";",
};
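We tell the reader that the file has no header, allow comments in the file, and set the comment character to #. (In fact, we do not have to set the comment character, because # is used by default.) We set the delimiter to the semicolon character. (Blank lines are ignored by default.)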
using var csvReader = new CsvReader(streamReader, csvConfig);
The configuration is passed to the CsvReader.
$ dotnet run
John Doe is gardener
Roger Roe is driver
Lucy Smith is teacher
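The examples in this tutorial build the configuration from CultureInfo.CurrentCulture. To our understanding, CsvHelper takes its default delimiter from the culture's list separator, so the default may differ between machines. Treat this as an assumption and check it against the CsvHelper version in use. The following sketch uses only the .NET base library to print the list separators involved.

```csharp
using System;
using System.Globalization;

// The invariant culture uses a comma as its list separator.
var invariant = CultureInfo.InvariantCulture.TextInfo.ListSeparator;
Console.WriteLine(invariant); // ,

// The current culture depends on the machine settings;
// on some systems the list separator is a semicolon.
var current = CultureInfo.CurrentCulture.TextInfo.ListSeparator;
Console.WriteLine(current);
```

If files must be read the same way everywhere, passing CultureInfo.InvariantCulture or setting Delimiter explicitly, as the configuration above does, avoids the dependency on machine settings.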
C# CSV quoting fields
In the following example, we show how to quote fields.
using System.Globalization;
using CsvHelper;
using CsvHelper.Configuration;
var csvConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
    ShouldQuote = args => args.Row.Index == 1
};
var users = new List<User>
{
    new (1, "John Doe", "gardener", "12/5/1997"),
    new (2, "Lucy Smith", "teacher", "5/12/1983"),
    new (3, "Roger Roe", "driver", "4/2/2001"),
    new (4, "Robert Smith", "cook", "21/11/1976"),
    new (5, "Maria Smith", "accountant", "5/9/1986"),
};
using var fs = new StreamWriter("users.csv");
using var csvWriter = new CsvWriter(fs, csvConfig);
csvWriter.WriteHeader<User>();
csvWriter.NextRecord();
csvWriter.WriteRecords(users);
record User(int Id, string Name, string Occupation, string DateOfBirth);
We have a list of users. We decide to quote the second field of each row.
var csvConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
    ShouldQuote = args => args.Row.Index == 1
};
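In the CsvConfiguration, we set the ShouldQuote property to a predicate that returns true for the second field. The field index is zero-based, so the second field has index 1.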
$ dotnet run
$ cat users.csv
Id,"Name",Occupation,DateOfBirth
1,"John Doe",gardener,12/5/1997
2,"Lucy Smith",teacher,5/12/1983
3,"Roger Roe",driver,4/2/2001
4,"Robert Smith",cook,21/11/1976
5,"Maria Smith",accountant,5/9/1986
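In the example, CsvHelper decides about quoting through ShouldQuote. To illustrate what quoting means at the format level, here is a minimal sketch of the usual CSV convention: a field is wrapped in double quotes when it contains a comma, a double quote, or a line break, and embedded double quotes are doubled. QuoteIfNeeded is a hypothetical helper written for this illustration; it is not part of CsvHelper and does not claim to reproduce its default rules.

```csharp
using System;

// Hypothetical helper (not a CsvHelper API): applies the common CSV
// quoting convention to a single field.
static string QuoteIfNeeded(string field)
{
    bool mustQuote = field.Contains(',') || field.Contains('"')
        || field.Contains('\n') || field.Contains('\r');

    if (!mustQuote)
    {
        return field;
    }

    // Embedded double quotes are escaped by doubling them.
    return "\"" + field.Replace("\"", "\"\"") + "\"";
}

var plain = QuoteIfNeeded("gardener");
var withComma = QuoteIfNeeded("Doe, John");
var withQuote = QuoteIfNeeded("5\" disk");

Console.WriteLine(plain);     // gardener
Console.WriteLine(withComma); // "Doe, John"
Console.WriteLine(withQuote); // "5"" disk"
```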
C# CSV write fields
The fields of a record are written with WriteField.
using System.Globalization;
using System.Text;
using CsvHelper;
var users = new List<User>
{
    new ("John", "Doe", "gardener"),
    new ("Roger", "Roe", "driver"),
    new ("Lucy", "Smith", "teacher"),
};
using var mem = new MemoryStream();
using var writer = new StreamWriter(mem);
using var csvWriter = new CsvWriter(writer, CultureInfo.CurrentCulture);
csvWriter.WriteField("FirstName");
csvWriter.WriteField("LastName");
csvWriter.WriteField("Occupation");
csvWriter.NextRecord();
foreach (var user in users)
{
    csvWriter.WriteField(user.FirstName);
    csvWriter.WriteField(user.LastName);
    csvWriter.WriteField(user.Occupation);
    csvWriter.NextRecord();
}
writer.Flush();
var res = Encoding.UTF8.GetString(mem.ToArray());
Console.WriteLine(res);
record User(string FirstName, string LastName, string Occupation);
In the example, we write CSV data into memory and then to the console.
csvWriter.WriteField("FirstName");
csvWriter.WriteField("LastName");
csvWriter.WriteField("Occupation");
csvWriter.NextRecord();
First, we write the header. The NextRecord method ends the current record by adding a newline.
foreach (var user in users)
{
    csvWriter.WriteField(user.FirstName);
    csvWriter.WriteField(user.LastName);
    csvWriter.WriteField(user.Occupation);
    csvWriter.NextRecord();
}
WriteField writes a field to the current record. NextRecord ends the record and starts a new one.
writer.Flush();
The writer buffers its output. To actually write the data to the underlying stream, we need to call Flush.
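The effect of Flush can be observed without CsvHelper. The following minimal sketch, which uses only a StreamWriter over a MemoryStream, checks the stream length before and after the call. The variable names are ours.

```csharp
using System;
using System.IO;

using var mem = new MemoryStream();
using var writer = new StreamWriter(mem);

writer.Write("FirstName,LastName,Occupation");

// The text is short, so it still sits in the writer's internal buffer.
long before = mem.Length;

writer.Flush();

// After Flush, the buffered characters have reached the memory stream.
long after = mem.Length;

Console.WriteLine($"before: {before}, after: {after}");
```

Without the Flush call, reading the memory stream at this point would yield an empty result.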
var res = Encoding.UTF8.GetString(mem.ToArray());
Console.WriteLine(res);
We write the data from memory to the console.
$ dotnet run
FirstName,LastName,Occupation
John,Doe,gardener
Roger,Roe,driver
Lucy,Smith,teacher
C# CSV write data with WriteRecords
In the following example, we write all records in one go with WriteRecords.
using System.Globalization;
using CsvHelper;
var users = new List<User>
{
    new ("John", "Doe", "gardener"),
    new ("Lucy", "Smith", "teacher"),
    new ("Roger", "Roe", "writer"),
};
using var writer = new StreamWriter(Console.OpenStandardOutput());
using var csvWriter = new CsvWriter(writer, CultureInfo.CurrentCulture);
csvWriter.WriteHeader<User>();
csvWriter.NextRecord(); // adds new line after header
csvWriter.WriteRecords(users);
record User(string FirstName, string LastName, string Occupation);
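In the example, we write the data from a list of user objects to the console. WriteHeader writes the header record from the members of the given type.
C# CSV custom solution
Generally, it is recommended to use an existing library for working with CSV. Although the format looks simple, it is not easy to provide a robust solution. (For instance, fields may be quoted.)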
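To see why, consider a line with a quoted field that contains a comma. The following sketch compares a plain Split with a small quote-aware splitter. SplitCsvLine is a hypothetical helper for illustration only: it handles double-quoted fields and doubled quotes, but not line breaks inside fields or other delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits one CSV line on commas while honoring
// double-quoted fields and doubled quotes ("") inside them.
static List<string> SplitCsvLine(string line)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];

        if (inQuotes)
        {
            if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
            {
                current.Append('"'); // doubled quote is a literal quote
                i++;
            }
            else if (c == '"')
            {
                inQuotes = false;    // closing quote
            }
            else
            {
                current.Append(c);
            }
        }
        else if (c == '"')
        {
            inQuotes = true;         // opening quote
        }
        else if (c == ',')
        {
            fields.Add(current.ToString());
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }

    fields.Add(current.ToString());
    return fields;
}

var sample = "\"Doe, John\",gardener,12/5/1997";

var naive = sample.Split(',');
var parsed = SplitCsvLine(sample);

Console.WriteLine(naive.Length); // 4
Console.WriteLine(parsed.Count); // 3
Console.WriteLine(parsed[0]);    // Doe, John
```

Even this helper covers only part of what real-world files contain, which is the reason a library is usually the better choice.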
John Doe, gardener, 12/5/1997
Jane Doe, teacher, 5/10/1983
Robert Smith, driver, 4/2/2001
Maria Smith, cook, 9/11/1976
This is the data.csv file.
using System.Text;
var path = "data.csv";
var lines = File.ReadLines(path, Encoding.UTF8);
var users = from line in lines
            let fields = line.Replace(", ", ",").Split(",")
            select new User(fields[0], fields[1], DateTime.Parse(fields[2]));
var sorted = from user in users
             orderby user.DateOfBirth descending
             select user;
foreach (var user in sorted)
{
    Console.WriteLine(user);
}
public record User(string Name, string Occupation, DateTime DateOfBirth);
The example parses the CSV file with LINQ. It also sorts the users by date of birth in descending order.
$ dotnet run
User { Name = Robert Smith, Occupation = driver, DateOfBirth = 4/2/2001 12:00:00 AM }
User { Name = John Doe, Occupation = gardener, DateOfBirth = 12/5/1997 12:00:00 AM }
User { Name = Jane Doe, Occupation = teacher, DateOfBirth = 5/10/1983 12:00:00 AM }
User { Name = Maria Smith, Occupation = cook, DateOfBirth = 9/11/1976 12:00:00 AM }
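DateTime.Parse interprets the text according to the current culture, so a value such as 4/2/2001 can be read as April 2 or as February 4 depending on the machine. A minimal sketch, assuming the dates in data.csv are written month first (an assumption on our part, suggested by the US-style output above), pins the format with ParseExact:

```csharp
using System;
using System.Globalization;

var text = "12/5/1997";

// "M/d/yyyy" is an assumed format: month, day, year, without leading zeros.
var date = DateTime.ParseExact(text, "M/d/yyyy", CultureInfo.InvariantCulture);

Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 1997-12-05
```

With an explicit format and the invariant culture, the result no longer depends on regional settings.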
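C# export HTML table to CSV file
In the next example, we scrape an HTML table from a website and export the data into a CSV file.
For web scraping, we use the AngleSharp library.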
using System.Globalization;
using AngleSharp;
using CsvHelper;
var config = Configuration.Default.WithDefaultLoader();
using var context = BrowsingContext.New(config);
var url = "https://nrf.com/resources/top-retailers/top-100-retailers/top-100-retailers-2020";
using var doc = await context.OpenAsync(url);
var htable = doc.GetElementById("stores-list--section-23906");
var trs = htable.QuerySelectorAll("tr").Skip(1);
using var fs = new StreamWriter("data.csv");
using var writer = new CsvWriter(fs, CultureInfo.CurrentCulture);
var rows = new List<Row>();
foreach (var tr in trs)
{
    var tds = tr.QuerySelectorAll("td").Take(3);
    var fields = (from e in tds select e.TextContent).ToArray();
    var row = new Row(fields[0], fields[1], fields[2]);
    rows.Add(row);
}
writer.WriteRecords(rows);
record Row(string Rank, string Company, string Sales);
We scrape data from a table containing the top 100 US retailers of 2020.
var config = Configuration.Default.WithDefaultLoader();
using var context = BrowsingContext.New(config);
var url = "https://nrf.com/resources/top-retailers/top-100-retailers/top-100-retailers-2020";
using var doc = await context.OpenAsync(url);
We set up the AngleSharp context and retrieve the document from the provided link.
var htable = doc.GetElementById("stores-list--section-23906");
var trs = htable.QuerySelectorAll("tr").Skip(1);
We locate the HTML table and select all rows except the header row.
using var fs = new StreamWriter("data.csv");
using var writer = new CsvWriter(fs, CultureInfo.CurrentCulture);
We set up the CsvWriter.
foreach (var tr in trs)
{
    var tds = tr.QuerySelectorAll("td").Take(3);
    var fields = (from e in tds select e.TextContent).ToArray();
    var row = new Row(fields[0], fields[1], fields[2]);
    rows.Add(row);
}
We take the data from the first three columns of the table.
writer.WriteRecords(rows);
Finally, the records are written to the file with WriteRecords.
In this article, we have read and written CSV data in C# with the CsvHelper library.
