Why Excel's built-in CSV functionality corrupts your data

In spite of Excel's vast array of features and its overwhelming popularity, Excel's built-in CSV functionality isn't just lacking — it's downright dangerous! Excel and its faulty CSV functionality is a leading corrupter of data for thousands of organizations worldwide.

Apr 12, 2025

Why Excel's built-in CSV functionality corrupts your data

Microsoft Excel is perhaps the most successful commercial software product on the planet and is used by millions of individuals and companies across the world. However, in spite of Excel's vast array of features and its overwhelming popularity, Excel's built-in CSV functionality isn't just lacking — it's downright dangerous! Excel and its faulty CSV functionality is a leading corrupter of data for thousands of organizations worldwide.
Data corruption occurs when data is lost or inadvertently and adversely changed. Excel's built-in CSV functionality frequently corrupts data both as part of the import and export processes. Often the act of merely opening a CSV file in Excel and then immediately saving it, without making any changes, will result in the underlying data in the CSV file being corrupted and irreversibly changed. Issues which lead to Excel commonly corrupting data include misidentifying the character encoding of files, reformatting text fields as numbers or dates, rounding or truncating numbers, converting numeric values to scientific notation and reformatting and truncating date and date-time values.
Since spreadsheet programs naturally display data in rows and columns, they are ideal for displaying and editing tabular data and CSV files. However, using Microsoft Excel to process CSV data is often more trouble than it's worth because its native CSV functionality is neither reliable nor configurable. Because of this many people avoid Excel's CSV functionality altogether and resort to writing bug-prone one-off code scripts or to using complicated command-line tools to edit and analyze their data.
We made POWER CSV so that Excel finally gets the first class CSV support it so desperately needs. The POWER CSV Excel add-on transforms Excel into the world's most powerful CSV editor and adds an invaluable set of tools for viewing, editing, and analyzing data. With POWER CSV, Excel can now handle CSV editing with ease and reliability, putting an end to the frequent data corruption issues commonly faced while using Excel.
Before we delve into the benefits of using POWER CSV, we must first understand Excel's limitations and how editing a simple CSV file can lead to data corruption.

Reasons why Excel can corrupt CSV data:

  1. The 'General' NumberFormat silently corrupts data
    1. The 'General' NumberFormat is a silent killer that, along with saving formatted values, causes most of Excel's data corruption issues. When opening a CSV file through Excel's file menu or via File Explorer, every column of data is imported using Excel's 'General' NumberFormat which causes Excel to make a 'best-guess' of each column's data type, often resulting in data being incorrectly formatted or even lost. Excel's 'best-guess' functionality is simply unreliable and causes various issues including stripping leading zeros from text values like ID and ZIP codes, reformatting and mangling date and date-time values, and truncating or rounding numerical values. This, coupled with Excel saving formatted cell values, is a recipe for disaster.
  1. Excel saves formatted cell values instead of saving actual cell values
    1. When saving a CSV file, Excel saves cell values precisely as they appear in the worksheet instead of saving the underlying cell values. This is particularly dangerous because the 'General' NumberFormat will often reformat numbers for display by showing a rounded value, sometimes formatted in scientific notation. It is these rounded values that are saved instead of the real, underlying, cell values with full precision. For example, the value 123456789012345 contains 15 significant figures and can be safely represented in Excel without losing any precision but it is displayed in Excel as 1.23457E+14. Excel's built-in CSV functionality will save it as 1.23457E+14 instead of saving the cell's true value. Likewise the floating-point number 0.1234567898 is displayed in Excel as 0.12345679 and will also be saved as such. Similarly date-time values can lose precision unless they are formatted in a NumberFormat that displays them with their full precision.
  1. Excel has problems with character encodings
    1. When opening CSV files Excel isn't effective at determining a file's character encoding. Too often it incorrectly assumes that a file is using the ANSI encoding (Windows-1252 on Western/US computers) resulting in international characters being lost and corrupted. This problem commonly occurs for files that use the UTF-8 character encoding without the Byte Order Mark (BOM) where Excel will incorrectly detect and import such files using the ANSI character encoding, causing Unicode specific characters to be corrupted.
      Until recently, Excel could only save CSV files using the ANSI encoding, causing any international characters in the spreadsheet to be corrupted and lost. Newer versions of Excel now have a new 'CSV UTF-8' format which can be used to save files; however, because many people aren't familiar with character encodings they inadvertently continue to save CSVs using Excel's legacy ANSI encoded CSV format.
  1. Excel's one-million-row limit
    1. Modern versions of Excel contain 1,048,576 (or $2^{20}$) rows per worksheet; however, for simplicity's sake, we usually say that there are one-million rows per worksheet. CSV files can contain well over one-million rows, and when opening such a large file in Excel, only the first one-million rows are imported, anything after that is ignored. If the file is later saved, under the same filename, then anything after the first one-million rows is lost.

Excel's built-in ways to open or import a CSV file

Excel has three built-in ways of opening CSV files, each with its issues and limitations:
  1. Opening via the File menu or File Explorer
    1. When Excel is installed on a computer it becomes the default application for the CSV file type and when a CSV file is double-clicked in File Explorer Excel automatically opens the file. Many people incorrectly assume that because of this default file association that CSV files are actually a type of Excel file and that Excel can safely edit them.
      When a CSV file is opened via the File menu or through File Explorer, no preview dialog is shown, the data is simply imported into a new workbook. Because a preview dialog is not shown, you are at the mercy of Excel's mediocre and dangerous auto-detection. Common issues include incorrect character encoding detection (usually by detecting 'UTF-8 without BOM' encoded files as 'Windows-1252' instead), incorrect delimiter detection (often causing all data to be imported into a single column when a comma is expected but a semi-colon is used or vice-versa), and worst of all corrupting data by using the 'General' NumberFormat (trimming leading zeros, truncating numeric values, treating text values as dates or numbers, and reformatting date and date-time values).
  1. Importing via the Text Import Wizard
    1. The Text Import Wizard is a more practical way to import CSV files into Excel that allows you to preview your data and manually change import settings before importing into Excel. The Text Import Wizard cannot automatically detect a file's character encoding nor can it detect its delimiter but you can manually specify them.
      All columns of data use the 'General' NumberFormat by default but if you don't want Excel to mess up things like text values containing leading zeros, date and date-time values and numeric values with a large number of significant figures, you can manually change those columns to import as text. You can only change one column's data type at a time, and there is no setting to import all columns as text so it can be quite cumbersome manually changing the data type for each column.
  1. Importing via the Power Query Editor
    1. The Power Query Editor has replaced the Text Import Wizard as the default method of importing CSV files in newer versions of Excel. A CSV file can be imported using the Power Query Editor by going to the 'Data' ribbon tab and clicking the 'From Text/CSV' button in the 'Get & Transform Data' ribbon group.
      The Power Query Editor can detect some character encodings although it only seems to be able to detect Unicode encodings when they contain a BOM and anything else is detected as the ANSI encoding(Windows-1252 on Western/US computers). This problem commonly presents itself for files which are UTF-8 encoded and do not contain a BOM.
      The Power Query Editor also has a leg up on the Text Import Wizard in that it can automatically detect the delimiter of the CSV file as well as having column data-type detection. While the column data-type detection is an improvement over the Text Import Wizard, it still has issues that can lead to data corruption, especially concerning the reformatting of date and date-time formats.
      While files with more than one-million rows can technically be imported using the Power Query Editor, only the first one-million rows can be imported into an Excel worksheet. If you want to see the full file, you can do so by clicking the 'Manage Data Model' button which opens a new 'Power Pivot for Excel' window — this is not a standard Excel window, cell values are not editable.

Excel's built-in ways to save or export a CSV file

Excel has several ways to save CSV files using its built-in functionality, all of which present a host of issues. All of these methods save the formatted cell values as they appear on the screen, which often leads to numbers being truncated and possibly converted to scientific notation. The built-in save functionality lacks flexibility and configurability and there is no way to choose the delimiter, text-qualifier and escape characters. There also isn't an option to choose the end-of-line format, quoting rule or the ability to save only the selected range or only the visible data. Only a handful of character encodings are supported and most of them are legacy encodings which should almost never be used.
Excel's built-in CSV save formats do not always use a comma as the delimiter and instead they use the system separator. The system separator is defined in the Windows operating system regional settings. In some European countries, semi-colons are commonly used as the system separator instead of commas; this is because most of those countries use commas as a decimal separator instead of a decimal point. Having a mixture of comma-separated and semicolon-separated CSV files can often cause confusion when files are passed back and forth between companies in North America and Europe. If your system separator does not match the system separator of the machine used to create the file then Excel will open the file using the wrong delimiter resulting in all data appearing in a single column.
You can select one of Excel's built-in CSV save formats by expanding the 'Save as type:' dropdown in Excel's 'Save As' window. The various ways to save a CSV file using Excel's built-in functionality are:
  1. CSV (Comma delimited) (*.csv)
    The 'CSV (Comma delimited) (*.csv)' format saves the CSV file using the ANSI encoding (Windows-1252 on Western/US computers). Since the ANSI encoding does not support international characters found in the Unicode character set, all such characters will be corrupted and lost when saved using this format.
    1. Despite saying that this format is 'Comma delimited' in its name, it uses the system separator found in the regional settings control panel. In European countries, a semi-colon is often used as the default system separator instead of a comma.
      Like all of the built-in Excel options for saving CSV files, numeric values stored in cells formatted with the 'General' NumberFormat are often corrupted by rounding and/or converting numbers to scientific notation.
  1. CSV (MS-DOS) (*.csv)
    1. The 'CSV (MS-DOS) (*.csv)' format saves the CSV file using the legacy character encoding DOS-437, commonly referred to as 'Extended ASCII'. Like other legacy encodings, DOS-437 does not support international characters found in the Unicode character set and any such characters will be lost when saving.
  1. CSV (Macintosh) (*.csv)
    1. The 'CSV (Macintosh) (*.csv)' format saves the CSV file using the legacy character encoding Mac-Roman which, like other legacy encodings, does not support international characters in the Unicode character set. It also uses 'CR' as the end-of-line format as opposed to the Windows standard 'CRLF'.
  1. CSV UTF-8 (Comma delimited) (*.csv)
    1. The 'CSV UTF-8 (Comma delimited)' format was finally added to Excel in November 2016. It uses UTF-8, the world's most popular character encoding, and is the best option to use when saving a CSV file using Excel's built-in functionality.
      There are actually two different variants of UTF-8 character encodings: 'UTF-8 (with BOM)' and 'UTF-8 (without BOM)'. A BOM (Byte Order Mark) is a sequence of bytes at the beginning of a Unicode text stream that specifies its character encoding. When Excel saves a CSV file using the 'CSV UTF-8 (Comma delimited)' format it includes a BOM. The Unicode standard, however, recommends not including a BOM for UTF-8 encoded files. Also, note that Excel's open and import functionality can only correctly identify UTF-8 encoded files that do contain a BOM. UTF-8 files that don't have a BOM are incorrectly identified as using the ANSI encoding causing international characters to be corrupted and lost when the file is opened or imported.
      Like the 'CSV (Comma delimited) (*.csv)' format, the saved file uses the system separator which may not necessarily be a comma. Like all of the built-in Excel options for saving CSV files, numeric values stored in cells formatted with the 'General' NumberFormat are often corrupted by rounding and/or converting numbers to scientific notation.
  1. Unicode Text (*.txt)
    1. The 'Unicode Text (.txt)' format outputs a tab-delimited text file using the UTF-16LE character encoding. Before the 'CSV UTF-8 (Comma delimited) (.csv)' option was added to Excel, the Unicode Text format was the only reliable way to export international characters from Excel. Note that the resulting file is tab-separated and is not technically a CSV file. Subsequent applications that process the file may not be compatible with this file type and in such a case you would need to look for a tool to convert the file into the appropriate format.

A demonstration of how Excel corrupts CSV data

If you don't believe that Excel can be as bad as we say, then try the following example and see for yourself.
To demonstrate some of these issues we will be using the sample CSV file DontCorruptMe.csv and we will be using the most recent Office 365 version of Excel as of this post. The Windows operating system is set up with 'English (United States)' regional settings — this is important as Excel will format date values differently based on regional settings. We will open the file using Excel's standard 'File ➔ Open' menu and we will save the file using Excel's UTF-8 save format.
Column1
Column2
Column3
Column4
Column5
Column6
Column7
Column8
Column9
Column10
你好,世界!
April25
000000123
1234567890123456
123456789012345
1234567890.98765
2012-07-06
2013-05-18 05:24:36
2014-11-22 13:52:26.123
2015-07-28 11:32:12.123456
The original unaltered CSV file (as seen in Notepad++)
The original unaltered CSV file (as seen in Notepad++)
 
The CSV file after being opened by Excel (using Excel's 'File ➔ Open' menu)
The CSV file after being opened by Excel (using Excel's 'File ➔ Open' menu)
 
The CSV file after being saved by Excel using the save format 'CSV UTF-8 (Comma delimited) (.csv)'
The CSV file after being saved by Excel using the save format 'CSV UTF-8 (Comma delimited) (.csv)'

Analysis of how Excel corrupted the file:

  • Column1 (Unicode text)
    • Original Value: 你好,世界!

      Saved Value: ä½ å¥½ï¼Œä¸–ç•Œï¼HOP

      Excel incorrectly determined the character encoding of the file to be the ANSI encoding ('Windows-1252' in this case) instead of 'UTF-8 (without BOM)'. Because of this, the Unicode characters in the field value are corrupted while opening the file and the corrupted field value is subsequently saved.
  • Column2 (Username that looks like a date)
    • Original Value: April25

      Saved Value: 25-Apr

      Excel imported the column using the 'General' NumberFormat which then made a 'best-guess' as to the field's format and automatically detected it as a date and reformatted its value to '25-Apr' using the 'dd-mmm' NumberFormat. Since Excel saves the formatted value, the original value is corrupted and lost when the file is saved.
  • Column3 (Leading zeros)
    • Original Value: 000000123

      Saved Value: 123

      Leading zeros are common in ID codes and US zip codes. In order to retain leading zeros, the column should be imported as 'Text' and not as 'General'. Excel imported the column using the 'General' NumberFormat which caused the field to be treated as number instead of text resulting in the leading zeros being stripped.
  • Column4 (Integer with 16 significant figures)
    • Original Value: 1234567890123456

      Saved Value: 1.23457E+15

      The number '1234567890123456' has 16 significant figures whereas Excel can only represent numbers with up to 15 significant figures without losing any precision. In order to retain the full precision of this number, the column should be imported as 'Text' and not as 'General'. Excel actually corrupts this number twice: first on import the field value is truncated to '1234567890123450' since Excel can only store 15 significant figures and finally when saving the file, the field value is further corrupted to '1.23457E+15' since Excel saves the formatted cell value.
  • Column5 (Integer with 15 significant figures)
    • Original Value: 123456789012345

      Saved Value: 1.23457E+14

      The number '123456789012345' has 15 significant figures and can be safely imported into Excel without losing any precision. However, Excel formats the cell with the 'General' NumberFormat and this causes it to display as a rounded value in scientific notation format. The actual cell value with full precision can still be viewed in the Excel formula bar, but when Excel saves the CSV file it saves the formatted value which is rounded and uses scientific notation.
  • Column6 (Floating point number with 15 significant figures)
    • Original Value: 1234567890.98765

      Saved Value: 1234567891

      The floating point number '1234567890.98765' has 15 significant figures and can be safely imported into Excel without losing any precision. Like the previous example, the value is formatted using the 'General' NumberFormat which causes it's display value to be rounded and the rounded value is saved instead of the cell's true value.
  • Column7 (Date 'yyyy-MM-dd')
    • Original Value: 2012-07-06

      Saved Value: 7/6/2012

      If a date is imported with the 'General' NumberFormat, Excel will reformat the date according to the short date format defined in the operating system's regional settings which in this case is 'M/d/yyyy' for the 'English (United States)' locale. In this instance the date value remains the same but the date format does not. A change in the date's format after saving can cause issues for subsequent programs that process the file which may be expecting the date in its original 'yyyy-MM-dd' format.
  • Column8 (Date-time 'yyyy-MM-dd hh:mm:ss')
    • Original Value: 2013-05-18 05:24:36

      Saved Value: 5/18/2013 5:24

      When imported using the 'General' NumberFormat, the date-time value is reformatted to the English (United States) regional date-time format 'M/d/yyyy h:mm'. Note that the regional date-time format hides the seconds in the formatted cell value. The seconds are still stored in the cell value and can be seen in the Excel formula bar; however, the formatted value is saved and the seconds are lost.
  • Column9 (Date-time 'yyyy-MM-dd hh:mm:ss.000')
    • Original Value: 2014-11-22 13:52:26.123

      Saved Value: 52:26.1

      This is a date-time value with millisecond precision and Excel is capable of storing this value with full precision. However, it is important to note that Excel's formula bar cannot display any floating-point precision for seconds (it will display the value in the en-US regional format as '11/22/2014 1:52:26 PM') and if you give the formula bar focus, the millisecond precision will be stripped from the cell value when the formula bar loses focus. Since this field is imported using the 'General' NumberFormat, Excel bizarrely decides to reformat the value as 'mm:ss.0'. When the CSV file is saved, the formatted value is saved and is unrecognizable when compared to its original value.
  • Column10 (Date-time 'yyyy-MM-dd hh:mm:ss.000000')
    • Original Value: 2015-07-28 11:32:12.123456

      Saved Value: 32:12.1

      This is a date-time value with microsecond precision and Excel can only safely represent up to millisecond precision. To prevent loss of data, this value should be imported as 'Text' but since the field is imported as 'General' it ends up being converted to and saved using the bizarre 'mm:ss.0' format.

POWER CSV - The better way to handle CSVs in Excel

notion image
POWER CSV is an Excel add-on that is accessible as an extra Excel ribbon tab containing a set of features that transforms Excel into a much more powerful tool for viewing, manipulating, transforming and analyzing CSV files and tabular data. At its core POWER CSV's import and export functionality were designed around data integrity and reliability.
POWER CSV is designed not only to be extremely reliable, but it is also the most configurable tool of its kind. Almost every option or setting of POWER CSV can be configured to best meet your work flow needs. With POWER CSV, you can stop worrying about Excel adversely altering your data and shift your concentration to where it should be: on manipulating and analyzing your data.
Below we will introduce you to the POWER CSV Import and Export functionality and how they deal with our sample CSV file.

POWER CSV Import & Open

POWER CSV contains two ways to bring CSV data into an Excel workbook: POWER CSV Import and POWER CSV Open. The primary difference between these two methods is that the Import functionality shows you an import preview dialog with configurable settings prior to importing the data whereas the Open functionality simply imports the data without showing a dialog. Both the Import and Open functionality can be configured to your liking via our powerful Settings dialog. Large CSV files containing millions of rows can also be imported by importing the data into multiple worksheets. POWER CSV's File Explorer integration enables you to import or open files directly from File Explorer and associated file types can be opened by double-clicking the file.
notion image
POWER CSV's Import and Open functionality provides powerful features such as automatic detection of delimiters, text-qualifiers, escape-characters, header rows, character encodings, and column data formats. Date and date-time values are formatted by default using the Excel NumberFormat which formats the date so that it appears in Excel exactly the same way it appears in the underlying CSV file. All default data formats can be overridden and dates can even be changed to a different date format prior to importing them (e.g. change dates from 'M/d/yyyy' to 'yyyy-MM-dd'). The default settings are set such that columns containing numbers and dates that cannot be exactly represented in Excel default to be imported as text so that no data is lost as part of the import process. Columns containing numeric values (with 15 significant figures or less) default to the 'General' NumberFormat and while Excel may still display some numbers by rounding them or by showing them using scientific notation their underlying cell values will still contain their full numerical precision. Unlike Excel's built-in CSV functionality, POWER CSV's export functionality knows to always write out 'General' formatted numbers using their full numerical precision.
For the sample file, you'll see that POWER CSV correctly determines the character encoding to be 'UTF-8 (without BOM)' which ensures that international characters are displayed correctly. Automatic data format detection determines whether each column contains text, numeric values or dates and is capable of detecting thousands of international date and date-time formats. Columns containing numbers and dates that cannot be exactly represented in Excel due to their high level of precision will have text as their default data format to preserve data integrity.
The Import Preview displays values exactly the same way they appear once they are imported into an Excel worksheet. Numeric columns imported with the 'General' number format may display some numbers as rounded or using scientific notation; however, Excel still maintains up to 15 significant figures of numerical precision regardless of how it displays those numbers and the POWER CSV Export functionality ensures that 'General' formatted numbers are always exported with full precision and are not inadvertently converted into scientific notation. Note that all detected data formats can be overridden and dates can easily be reformatted to different date formats prior to importing. The Advanced Import Settings dialog can configure additional options like the regional settings for how numbers are parsed, how the automatic data format detection should work for numbers with leading zeros or numbers with more than 15 significant figures, and how dates with time zone offsets should be formatted.
notion image

POWER CSV Export & Save

POWER CSV contains two ways to save CSV data from an Excel workbook: POWER CSV Export and POWER CSV Save. Similar to the difference between Import and Open, the difference between Export and Save is that exporting first shows you an export preview dialog whereas saving simply saves the file without showing you a dialog.
notion image
The POWER CSV Export dialog contains a multitude of options. You have the option to choose which character encoding to use as well as the ability to choose the delimiter, text-qualifier and escape characters. You can choose to export only the visible, or auto-filtered, data or to export all data. Data can be exported using the formatted cell values or using the actual cell values while formatting all dates and numbers to an international locale.
For the sample file, you can see that the POWER CSV Export dialog preview is identical to the source file. Note that even though we have chosen to output formatted cell values, numeric values with the 'General' NumberFormat will still be exported with their full numerical precision regardless of how Excel decides to display them in the worksheet. The 'Fixed-Point' Number Format setting is selected by default and ensures that numbers are not inadvertently converted to scientific notation.
notion image

Conclusion

In this article, we have examined some of the major pitfalls of Excel's built-in CSV functionality and how the POWER CSV Excel add-on can help you to reliably deal with CSV data. POWER CSV has many other notable features that were not covered in this article, specifically:
So be sure to check out our extensive documentation for more info. We would love to hear your thoughts so please don't hesitate to reach out in the comment section below.