Case Study: High-Volume Reporting Requirement, Issues & Resolution

In this blog I’ll discuss some post-release reporting issues that we faced for one of our projects and the solutions we implemented. On the technology side, we had SQL Server 2008 R2 and MVC 4.0 application (which were hosted in Amazon Web Services) in our production environment.

The Problem

The post-production release reporting system was not responding as per the user expectations. For most of the high-volume reports (50K rows to 200K rows in report output), we were getting request timeout error. Client SLA for response time was two minutes; hence any report (big or small) must return data within two minutes. All the reports were designed using SQL Server Reporting Services 2008 R2. In all there were close to 40 reports with such timeout issues.

Initial Investigation

In the first phase of analysis, we realized that the database design was highly normalized. All data changes were captured in the same table using SCD2 methodology which was causing data to grow many folds. Most reporting queries were joining multiple views where each view had multiple tables joined together. There were queries which had around 30 or more tables joining together.  SSRS was taking more than two minutes to render the reports, and by that time the MVC web pages that used Report Viewer control were timing out. Increasing timeout limits webpages did not resolve the issue. Database redesign was out of the question as it had cascading effects on code base. Moreover, the project was already in production and these performance issues were faced post-release.

Solution Implemented

We created data-driven subscriptions for all high-volume SSRS reports and then provided end users with options to download these reports. High-level solution architecture is shown below:

This solution framework had three pieces:

  1. SSRS Data-Driven Subscriptions: We created subscriptions which were configured to create new Excel reports files every night. A new file was created every night for a given report and a combination of report parameters. All parameters were mapped to SQL queries — that is also the reason why we choose data-driven subscriptions and not standard subscriptions.
  2. SSIS package to update report metadata: As part of the solution we created a simple table used for storing the report metadata information. This included the report name, combinations of parameters which were used as report filters and report date. A SSIS package was created to read the list of reports created every night and load same in the report metadata table. This SSIS package was also responsible for cleaning up of old reports by moving them to Archive location. We had different retention periods for different report types.
  3. Download web page: A simple MVC webpage, as shown in the screenshot below, was created for the report download purpose. This page first authenticates users based on their logins and then allows users to download the Excel files according to their access levels. With this solution in place, there was no load on SQL Server and end users were able to get all high-volume reports (sometimes as big as 80MB) within two minutes. The download webpage did not have any timeout errors as the download started as soon as a user clicked on the download link. SQL Server and SSRS were not involved in this process.

Other than the three pieces mentioned above, we also did couple of minor enhancements as explained below:

  • T-SQL improvements: We modified a few reporting stored procedures wherein we replaced complex CTE and views with temp tables with indexes. SQL enhancements produced up to 50% performance gains, but only from the SQL Server side. SSRS still takes lot of time to render huge datasets.
  • Exporting 100K+ rows to Excel: In SSRS 2008 R2 export to Excel creates .XLS file which has a limit of 65K rows per sheet. This caused issues with most high-volume reports. This was resolved by pragmatically creating page breaks at every 50K rows, using conditional expressions in SSRS. Page breaks resulted data to be moved to next sheet in same Excel file.

Development Efforts

We did not want to redesign or redevelop report using .NET or any other tools as that would take lot of time. So as part of the solution data-driven subscriptions were created for each of the existing reports. This required minimum development efforts as subscriptions is a much simpler process comparatively. Also creating a single SSIS package and a simple report download page did not take much time. Finally, there were no extra licensing factors as data-driven subscriptions and SSIS are part of the SQL Server license.

Deployment

For releasing the subscription to production, we used rs.exe utility to deploy RDL files and subscriptions scripts in UAT and production environments. At the time of writing of this blog, more than 1,800 new reports have been generated every night by the framework and overall there are more than 10,000 reports (Excel files) available for quick download from the website.

Post-Deployment

No major issues were faced post-deployment. We did fine-tune subscription schedules based on production experience.

Advantages of the Framework:

  1. No need to recreate or redesign reports: As data-driven subscriptions is part of SSRS Architecture, it allowed use of SSRS reports created by the development team and hence no additional report development or modification efforts were required. Data-driven subscriptions provided the delivery platform for quick download of pre-created and pre-populated reports.
  2. Flexible: This framework is flexible as new reports can be added/removed from subscriptions without any code changes. Report parameters are generated at the run time based on data present in database, hence addition of new regions and customers won’t need any changes to subscriptions.
  3. Output format: This framework is also flexible enough to allow report creation in other format like PDF and DOC without any addition coding. We can have one report available in more than one format for download, if required.
  4. Reduced load on SQL: As reports are created every night, no interaction with SQL Server is needed during the day while downloading the reports. This improved systems performance and the load on SQL Server does not increase with the increase in number of reporting users.
  5. Logging: Subscriptions are logged when created and executed. This helps in troubleshooting.

Limitations of the Framework:

  1. Data is old: The reports created were always updated as per yesterday’s data. Client was fine with that much latency for the time being. Later on we implemented an on-demand reporting framework for getting near-live data. There will be a separate blog on on-demand reporting.
  2. Additional Monitoring: With data-driven subscriptions in place, DBAs have one more thing that they need to monitor. As per our post-release experience, however, this monitoring is not that much demanding.
  3. SQL Server Standard Edition: Please note that data-driven subscriptions feature of SSRS is not available in Standard and Express editions of SQL Server.

Conclusion and Lessons Learned

Normalization is good but reporting needs should also be considered while designing any database. If required, de-normalization should be done to enhance the performance of reports which will be executed frequently. Architects should consider a separate reporting database (or if possible, a data warehouse) to cater reporting needs. This becomes even more important if the database size is huge and data is updated frequently. DW is the final solution to enterprise level reporting needs. Features like data-driven subscriptions and other options should be considered as a possible report delivery mechanism.

Side note: It’s tough to capture all implementation details in a single blog, hence only high level details are covered here. We have tried to not miss any important points. Please feel free to comment with questions, or email me for more details and scripts.

About Yogesh Goswami

Yogesh Goswami is a Development Lead at AIS with more than 12 years of experience in the software development. His passion lies in the SQL Server Database and Microsoft BI technologies. Prior to joining AIS, Yogesh worked with Accenture, EMC2 and Microsoft; mostly on SQL DW and SQL BI projects. He has presented sessions on Azure SQL at Azure events. His credentials include being MCP since 2002 and having certifications in MCDBA, MCSD, MCTS SQL 2005, MCITP SQL BI 2008 and MCSA SQL 2012.

  • yogirk

    Nice post. Looking forward to your next posts!