Based in Ann Arbor, Michigan, ProQuest is committed to empowering researchers and librarians around the world. Through partnerships with content holders, ProQuest preserves rich, vast, and varied information and packages it with digital technologies that enhance its discovery, sharing, and management. ProQuest’s growing content collection encompasses ninety thousand authoritative sources and six billion digital pages, and spans six centuries. It includes the world’s largest collection of dissertations and theses; twenty million pages and three centuries of global, national, regional, and specialty newspapers; and more than 450,000 ebooks. ProQuest serves thousands of academic, government, and corporate campuses in over one hundred countries.
ProQuest relies on measuring content usage to determine the royalties paid to publishers, set subscription prices with librarians, evaluate user satisfaction, and drive the development of new features. However, the company had inherited legacy IT systems from its various acquisitions and was struggling to provide a clear, accurate view of usage across its product lines. Staff had to spend valuable hours manually answering the most urgent requests, and in many cases the required data was not available at a granular level at all. “We needed a solution to make usage a market opportunity and contribute to the business,” says Rajan Odayar, vice president, head of Big Data Analytics and Global Enterprise Management Solutions at ProQuest.
ProQuest already relies heavily on Amazon Web Services for its production systems. After benchmarking various technology solutions, the company chose Amazon Redshift as the core repository for corporate-wide usage tracking. Squid Solutions helped ProQuest implement its new vision. Squid, a company headquartered in Paris, France, with offices in San Francisco, California, provides consulting services and Usage Analytics software optimized for Redshift. “With Squid Solutions and Redshift, we can answer the needs of our various internal users in customer service, sales, product managers, and marketing. They can easily query years of data at any level of granularity, filter it on dozens of dimensions, and consolidate usage across account hierarchies and product lines,” says Rajan.
Squid and ProQuest set up a fully redundant Usage Analytics platform in two AWS regions. They launch Elastic Compute Cloud (EC2) instances from private AMIs to host the Squid Solutions analytics server. For security reasons, the EC2 instances are hosted within Virtual Private Cloud (VPC) networks. Elastic Load Balancers (ELB) act as gateways between the users and Squid Solutions’ Usage Analytics software. Route 53 (Amazon’s DNS service) checks the health of both regions and points users to the healthy EC2 instance. Finally, Simple Storage Service (S3) buckets serve as the spooling area for data loaded into Redshift. Within the same VPC network, the Redshift clusters combine HDD-based (Dense Storage) nodes, used for data-processing tasks and for queries at a low level of data granularity, with SSD-based (Dense Compute) nodes, used to speed up mainstream analytical queries and reporting. “We benchmarked both systems and found this hybrid architecture to be the most adapted to ProQuest’s needs,” says Adrien Schmidt, CEO of Squid Solutions.
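As an illustration of the S3 spooling step described above, the following minimal Python sketch builds a Redshift COPY statement that loads files staged in an S3 bucket. It is not ProQuest's or Squid's actual loading code. The table name, bucket, prefix, IAM role ARN, and the gzipped CSV file format are all hypothetical values chosen for the example.

```python
def build_copy_statement(table: str, bucket: str, prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for gzipped CSV files staged in S3.

    All arguments are illustrative; the source does not describe the real
    table layout, bucket names, or file format.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV GZIP;"
    )


# Hypothetical values, used only to show the shape of the statement.
statement = build_copy_statement(
    table="usage_events",
    bucket="example-usage-spool",
    prefix="2024/01/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoad",
)
print(statement)
```

The resulting statement would be run against the cluster with any SQL client. Staging files in S3 first lets Redshift read them in parallel across the cluster rather than row by row over a single connection.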
“Squid was instrumental in helping us set up the system in record time,” says Rajan, “and that was key to getting approval from our senior leadership team.” In fact, the final decision was made after an eight-week proof of concept during which 5 TB of data was loaded from eighteen different sources.
“It’s so easy to set things up in AWS that we started loading data on the third day of the project and could show our first dashboard within ten days,” says Adrien. “That really got business users involved in the project and helped us make sure that we were answering their needs.”
Now that the Usage Analytics platform is up and running, usage metrics are available in multiple forms adapted to the needs of the different business users. The sales team, for instance, accesses usage metrics directly in their sales force interface—a must-have for people who manage dozens of accounts.
“We wanted Usage Analytics at the center of our business,” says Rajan, “and that’s what we got from AWS and Squid.”