@@ -1949,56 +1949,106 @@ module and use the same parsing code as the above to convert tabular data into
19491949a DataFrame. See the :ref: `cookbook<cookbook.excel> ` for some
19501950advanced strategies
19511951
1952- Besides ``read_excel `` you can also read Excel files using the ``ExcelFile ``
1953- class. The following two commands are equivalent:
1952+ Reading Excel Files
1953+ ~~~~~~~~~~~~~~~~~~~
1954+
1955+ .. versionadded :: 0.16
1956+
1957+ ``read_excel `` can read more than one sheet, by setting ``sheetname `` to either
1958+ a list of sheet names, a list of sheet positions, or ``None `` to read all sheets.
1959+
1960+ .. versionadded :: 0.13
1961+
1962+ Sheets can be specified by sheet index or sheet name, using an integer or string,
1963+ respectively.
1964+
1965+ .. versionadded :: 0.12
1966+
1967+ ``ExcelFile `` has been moved to the top level namespace.
1968+
1969+ There are two approaches to reading an Excel file: the ``read_excel`` function
1970+ and the ``ExcelFile`` class. ``read_excel`` is for reading one file
1971+ with file-specific arguments (i.e. identical data formats across sheets).
1972+ ``ExcelFile`` is for reading one file with sheet-specific arguments (i.e. various data
1973+ formats across sheets). Choosing the approach is largely a question of
1974+ code readability and execution speed.
1975+
1976+ Equivalent class and function approaches to read a single sheet:
19541977
19551978.. code-block :: python
19561979
19571980 # using the ExcelFile class
19581981 xls = pd.ExcelFile('path_to_file.xls')
1959- xls.parse('Sheet1', index_col=None, na_values=['NA'])
1982+ data = xls.parse('Sheet1', index_col=None, na_values=['NA'])
19601983
19611984 # using the read_excel function
1962- read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
1985+ data = read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
19631986
1964- The class based approach can be used to read multiple sheets or to introspect
1965- the sheet names using the ``sheet_names `` attribute.
1987+ Equivalent class and function approaches to read multiple sheets:
19661988
1967- .. note ::
1989+ .. code-block :: python
19681990
1969- The prior method of accessing ``ExcelFile `` has been moved from
1970- ``pandas.io.parsers `` to the top level namespace starting from pandas
1971- 0.12.0.
1991+ data = {}
1992+ # For when Sheet1's format differs from Sheet2
1993+ xls = pd.ExcelFile('path_to_file.xls')
1994+ data['Sheet1'] = xls.parse('Sheet1', index_col=None, na_values=['NA'])
1995+ data['Sheet2'] = xls.parse('Sheet2', index_col=1)
1996+
1997+ # For when Sheet1's format is identical to Sheet2
1998+ data = read_excel('path_to_file.xls', ['Sheet1', 'Sheet2'], index_col=None, na_values=['NA'])
1999+
2000+ Specifying Sheets
2001+ +++++++++++++++++
2002+ .. _io.specifying_sheets :
19722003
1973- .. versionadded :: 0.13
2004+ .. note :: The second argument is ``sheetname``, not to be confused with ``ExcelFile.sheet_names``
19742005
1975- There are now two ways to read in sheets from an Excel file. You can provide
1976- either the index of a sheet or its name to by passing different values for
1977- ``sheet_name ``.
2006+ .. note :: An ExcelFile's attribute ``sheet_names`` provides access to a list of sheets.
19782007
2008+ - The argument ``sheetname`` allows specifying the sheet or sheets to read.
2009+ - The default value for ``sheetname`` is 0, indicating to read the first sheet.
19792010- Pass a string to refer to the name of a particular sheet in the workbook.
19802011- Pass an integer to refer to the index of a sheet. Indices follow Python
19812012 convention, beginning at 0.
1982- - The default value is ``sheet_name=0 ``. This reads the first sheet.
1983-
1984- Using the sheet name:
2013+ - Pass a list of either strings or integers, to return a dictionary of specified sheets.
2014+ - Pass a ``None `` to return a dictionary of all available sheets.
19852015
19862016.. code-block :: python
19872017
2018+ # Returns a DataFrame
19882019 read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
19892020
19902021 Using the sheet index:
19912022
19922023.. code-block :: python
19932024
1994- read_excel('path_to_file.xls', 0, index_col=None, na_values=['NA'])
2025+ # Returns a DataFrame
2026+ read_excel('path_to_file.xls', 0, index_col=None, na_values=['NA'])
19952027
19962028 Using all default values:
19972029
19982030.. code-block :: python
19992031
2032+ # Returns a DataFrame
20002033 read_excel('path_to_file.xls')
20012034
2035+ Using None to get all sheets:
2036+
2037+ .. code-block :: python
2038+
2039+ # Returns a dictionary of DataFrames
2040+ read_excel('path_to_file.xls', sheetname=None)
2041+
2042+ Using a list to get multiple sheets:
2043+
2044+ .. code-block :: python
2045+
2046+ # Returns the 1st and 4th sheet, as a dictionary of DataFrames.
2047+ read_excel('path_to_file.xls', sheetname=['Sheet1', 3])
2048+
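The selection rules listed above can be illustrated without an Excel file. The following is a minimal sketch in plain Python, not the pandas implementation: ``select_sheets`` and the ``workbook`` list of ``(name, data)`` pairs are hypothetical stand-ins for ``read_excel`` and a real workbook, and the strings stand in for DataFrames. The dictionary keys for a list argument are assumed to be the list entries as given.

```python
def select_sheets(sheetname, available):
    """Mimic the documented ``sheetname`` rules on a stand-in workbook.

    ``available`` is an ordered list of (name, data) pairs (hypothetical).
    """
    names = [name for name, _ in available]
    by_name = dict(available)

    def one(key):
        # An integer is a 0-based position; a string is a sheet name.
        if isinstance(key, int):
            return by_name[names[key]]
        return by_name[key]

    if sheetname is None:
        # None selects every sheet, keyed by sheet name.
        return {name: by_name[name] for name in names}
    if isinstance(sheetname, list):
        # A list returns a dictionary keyed by the entries as given.
        return {key: one(key) for key in sheetname}
    # A single integer or string returns one sheet's data directly.
    return one(sheetname)


# Stand-in workbook with four sheets; the values play the role of DataFrames.
workbook = [("Sheet1", "a"), ("Sheet2", "b"), ("Sheet3", "c"), ("Sheet4", "d")]

first = select_sheets(0, workbook)                # single object
mixed = select_sheets(["Sheet1", 3], workbook)    # dictionary, 1st and 4th sheet
everything = select_sheets(None, workbook)        # dictionary of all sheets
```

The point of the sketch is the return type: a single integer or string yields one object, while a list or ``None`` yields a dictionary, so calling code should check which form it asked for before indexing into the result.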
2049+ Parsing Specific Columns
2050+ ++++++++++++++++++++++++
2051+
20022052It is often the case that users will insert columns to do temporary computations
20032053in Excel and you may not want to read in those columns. `read_excel ` takes
20042054a `parse_cols ` keyword to allow you to specify a subset of columns to parse.
@@ -2017,26 +2067,30 @@ indices to be parsed.
20172067
20182068 read_excel('path_to_file.xls', 'Sheet1', parse_cols=[0, 2, 3])
20192069
2020- .. note ::
2070+ Cell Converters
2071+ +++++++++++++++
20212072
2022- It is possible to transform the contents of Excel cells via the `converters `
2023- option. For instance, to convert a column to boolean:
2073+ It is possible to transform the contents of Excel cells via the `converters `
2074+ option. For instance, to convert a column to boolean:
20242075
2025- .. code-block :: python
2076+ .. code-block :: python
20262077
2027- read_excel('path_to_file.xls', 'Sheet1', converters={'MyBools': bool})
2078+ read_excel('path_to_file.xls', 'Sheet1', converters={'MyBools': bool})
20282079
2029- This options handles missing values and treats exceptions in the converters
2030- as missing data. Transformations are applied cell by cell rather than to the
2031- column as a whole, so the array dtype is not guaranteed. For instance, a
2032- column of integers with missing values cannot be transformed to an array
2033- with integer dtype, because NaN is strictly a float. You can manually mask
2034- missing data to recover integer dtype:
2080+ This option handles missing values and treats exceptions in the converters
2081+ as missing data. Transformations are applied cell by cell rather than to the
2082+ column as a whole, so the array dtype is not guaranteed. For instance, a
2083+ column of integers with missing values cannot be transformed to an array
2084+ with integer dtype, because NaN is strictly a float. You can manually mask
2085+ missing data to recover integer dtype:
20352086
2036- .. code-block :: python
2087+ .. code-block :: python
20372088
2038- cfun = lambda x: int(x) if x else -1
2039- read_excel('path_to_file.xls', 'Sheet1', converters={'MyInts': cfun})
2089+ cfun = lambda x: int(x) if x else -1
2090+ read_excel('path_to_file.xls', 'Sheet1', converters={'MyInts': cfun})
2091+
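Because converters run cell by cell, their behavior can be checked on a plain list before passing them to ``read_excel``. The sketch below is an illustration only: ``to_int_or_sentinel`` is a hypothetical helper, and it assumes that empty cells reach the converter as ``''`` or ``None``. Unlike the ``cfun`` lambda, which maps every falsy cell (including a genuine ``0``) to ``-1``, this variant keeps zeros intact.

```python
def to_int_or_sentinel(cell, sentinel=-1):
    """Convert one cell to int, mapping empty cells to a sentinel value."""
    # Assumption: an empty cell arrives as '' or None.
    if cell is None or cell == "":
        return sentinel
    return int(cell)


# A stand-in column as it might arrive from a sheet: floats plus one empty cell.
raw_column = [1.0, "", 3.0, 0.0]

# Apply the converter to each cell in turn, as the converters option does.
converted = [to_int_or_sentinel(cell) for cell in raw_column]

# Every value is now an int, so an integer dtype is recoverable.
all_ints = all(isinstance(value, int) for value in converted)

# For comparison, the lambda from the text treats 0.0 as missing.
cfun = lambda x: int(x) if x else -1
zero_with_cfun = cfun(0.0)
```

Whether to use a sentinel such as ``-1`` depends on the data: it is only safe when that value cannot occur as a legitimate entry in the column.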
2092+ Writing Excel Files
2093+ ~~~~~~~~~~~~~~~~~~~
20402094
20412095To write a DataFrame object to a sheet of an Excel file, you can use the
20422096``to_excel `` instance method. The arguments are largely the same as ``to_csv ``