Content as a Service
As one of Canada’s most successful integrated media and entertainment companies, Corus have multiple TV channels and websites for each channel.
It had been a challenge to have multiple channels' live schedule data displayed on websites. All the data are from a central repository. It became a little bit difficult since the repository is not always available. We had used Feeds module to import all the schedule data. Each channel website keeps a live copy of the schedule data. Things got worse because of the way we update the program items. We delete all the current schedule data in the system and then imported from the central repository. Sometimes, our schedule pages became empty because the central repository is not available.
Pedram Tiv, the director of digital operations at Corus Entertainment, had a vision of building a robust schedule for all channels. He wants to establish a Drupal website as a schedule service provider - content as a service. The service website download and synchronize all channels schedule data. Our content manager can also login to the website and edit any schedule items. The site keeps all the revisions for the changes. Since, the central repository only provide raw data, It is helpful we can edit the scheduled show title or series name.
I loved this brilliant idea as soon as he had explained it to me. We are building a Drupal website as a content service provider. It means we would build a CMS for other CMS websites. Scalability is always challenging for a modern website. To make it scalable, Pedram added another layer of cache protection. We added S3 cache between the schedule service and the front end web servers. With it, schedule service can handle more channels and millions of requests each day. Front end websites download schedule data from the Amazon S3 bucket only. What we did is creating and uploading seven days' schedule data to S3. We set up a cron job for this task. Every day, It uploads thousands of JSON schedule files for different channels in different time zones of next seven days each time.
This setup offloaded the pressure of schedule server and let it serve unlimited front end users. It gives seven days of grace period. It allowed the schedule server to be offline without interrupting the service. One time, our schedule service was down for three days. The schedule service was not affected because we have seven days of schedule data in an S3 bucket. By using S3 as another layer of protection, it provided excellent high availability.
Our schedule service have been up and running for many months without a problem. There are over 100,000 active nodes in the system. For more detail about importing large number of content and building an efficient system, we have some other blogs for this project.