My task was to write Excel in a bulk. So here is my analysis:
Pros:
- Open XML SDK is an great gift from Microsoft because you do not have to be dependent on installing Microsoft Office. When reading/generating an Excel file through other methods, you need to have installed on the server machine the office runitme dlls and in some cases the office should be installed too. While generating the excel file, always an instance of the Excel.exe is created on the server machine, meaning that if 50 files are getting gnerated 50 Excel.exe instances will be opened on the server machine, which could be stessful for the server performance and also handling the closing of the instances after the generation is finished is another problem. But in case of Open XML SDK, all you need to have is to install Open XML SDK and nothing else. You do not need to have Office on the server machine and no instances are created, so the performance of the server is always in control.
- It is a solid choice if you want to generate template-based Office documents.
- It is completely interoperable, meaning the excel document created in this format can be worked on across any platform.
- The SDK is stable and it is supported by Microsoft.
- It is easy to use.
- It has Open XML SDK productivity tool to generate code.
- LINQ can be used to navigate data of excel file.
- It performs comples operations with just a few lines of code.
- It simplifies the task of manipulation Open XML packages and the underlying Open XML schema elements within a package.
- OOXML documents are essentially zipped XML files and Open XML SDK is a collection of classes that allows you to work with the content of OOXML documents in a strongly-typed way. That is instead of unzipping a file to extract XML, loading that XML into a DOM tree and working with XML elements and attributes directly. It provides classes to do that.
Cons:
- It can not render Office functions. The page numbers of a TOC (Table of Content) or the actual page numbers of your Word document are not rendered until the user refreshes the document. The same is true for Excel calculations, so you can't do this rendering/calculation on the server-side.
- It supports only the documents created in Office 2007 or later(i.e. xlsx, docs etc), which means that you wish to work with documents created in the older versions of the office, it will now work as desired. But the documents that are created in Office 2007 or later and saved in Office 97-2003 format, will still work.
- It is a dead slow on manipulating the records while older methed like COM Interop libraries are too much fast in this case. My system has following specifications:
Processor: Intel Core i3, 3.30 GHz
RAM: 4 GB
System Type: 64 Bit OS
I had to update 4025 records in an Excel file and this task took 33 minutes on my system and when I used COM Interop libraries, it just took 33 seconds to update that excel file. Extreme difference indeed. Obviously it is wrapper on Office libraries, it saves the data in a stream and it has to manage even internal detail of the document by itself while the many internal details of the document are handled by Interop libraries themselves. But if you are working with small amount of data then it is great option to use but with bulk data, you have to analyze it first then suggest someone to use it as a primary library to Office documents automation. Do not suggest it without analyzing it first.
--------------------------------------------------
COM Interop:
Pros:
- Use native Microsoft libraries
- Fast and reliable
Cons:
- Use native Microsoft libraries because Office should be installed otherwise they can not be used. In other words Microsoft Office is the dependency to use it.
- Dependency/Version matching issues
- Concurrency/data integrity issues for web use when reading
- Scaling issues for web use (different from concurrency): need to create many instances of heavy Excel app on the server.
Pros:
- Open XML SDK is an great gift from Microsoft because you do not have to be dependent on installing Microsoft Office. When reading/generating an Excel file through other methods, you need to have installed on the server machine the office runitme dlls and in some cases the office should be installed too. While generating the excel file, always an instance of the Excel.exe is created on the server machine, meaning that if 50 files are getting gnerated 50 Excel.exe instances will be opened on the server machine, which could be stessful for the server performance and also handling the closing of the instances after the generation is finished is another problem. But in case of Open XML SDK, all you need to have is to install Open XML SDK and nothing else. You do not need to have Office on the server machine and no instances are created, so the performance of the server is always in control.
- It is a solid choice if you want to generate template-based Office documents.
- It is completely interoperable, meaning the excel document created in this format can be worked on across any platform.
- The SDK is stable and it is supported by Microsoft.
- It is easy to use.
- It has Open XML SDK productivity tool to generate code.
- LINQ can be used to navigate data of excel file.
- It performs comples operations with just a few lines of code.
- It simplifies the task of manipulation Open XML packages and the underlying Open XML schema elements within a package.
- OOXML documents are essentially zipped XML files and Open XML SDK is a collection of classes that allows you to work with the content of OOXML documents in a strongly-typed way. That is instead of unzipping a file to extract XML, loading that XML into a DOM tree and working with XML elements and attributes directly. It provides classes to do that.
Cons:
- It can not render Office functions. The page numbers of a TOC (Table of Content) or the actual page numbers of your Word document are not rendered until the user refreshes the document. The same is true for Excel calculations, so you can't do this rendering/calculation on the server-side.
- It supports only the documents created in Office 2007 or later(i.e. xlsx, docs etc), which means that you wish to work with documents created in the older versions of the office, it will now work as desired. But the documents that are created in Office 2007 or later and saved in Office 97-2003 format, will still work.
- It is a dead slow on manipulating the records while older methed like COM Interop libraries are too much fast in this case. My system has following specifications:
Processor: Intel Core i3, 3.30 GHz
RAM: 4 GB
System Type: 64 Bit OS
I had to update 4025 records in an Excel file and this task took 33 minutes on my system and when I used COM Interop libraries, it just took 33 seconds to update that excel file. Extreme difference indeed. Obviously it is wrapper on Office libraries, it saves the data in a stream and it has to manage even internal detail of the document by itself while the many internal details of the document are handled by Interop libraries themselves. But if you are working with small amount of data then it is great option to use but with bulk data, you have to analyze it first then suggest someone to use it as a primary library to Office documents automation. Do not suggest it without analyzing it first.
--------------------------------------------------
COM Interop:
Pros:
- Use native Microsoft libraries
- Fast and reliable
Cons:
- Use native Microsoft libraries because Office should be installed otherwise they can not be used. In other words Microsoft Office is the dependency to use it.
- Dependency/Version matching issues
- Concurrency/data integrity issues for web use when reading
- Scaling issues for web use (different from concurrency): need to create many instances of heavy Excel app on the server.
Open XML SDK is not a wrapper for Office! it is completely written in .net.
ReplyDelete