Fast break AI: Databricks helps Pacers reduce ML costs by 12,000X% and speed up insights

For Pacers Sports and Entertainment (PS&E), data about fans is just as important as what happens on the court.

Yet PS&E, the parent company of the Indiana Pacers (NBA), Indiana Fever (WNBA) and Indiana Mad Ants (NBA G League), was pumping untold sums of money into a $100,000-per-year machine learning (ML) platform to generate models for factors such as ticket demand and pricing, and the insights were not coming fast enough.

Jared Chavez, manager of data engineering and strategy, set out to change this, moving to Databricks on Salesforce a year and a half ago.

Now? His team performs the same range of projects, using carefully tuned compute configurations to gain critical insight into fan behavior, for only $8 a year. Chavez attributes the seemingly unthinkable, jaw-dropping decrease to his team’s ability to reduce ML compute to near-infinitesimal levels.
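For scale, the two annual figures quoted above can be compared directly. The short Python sketch below does only that arithmetic. The variable names are illustrative, and the dollar amounts are the ones reported in this article, not independently verified numbers.

```python
# Annual ML platform costs as reported in the article (USD).
previous_annual_cost = 100_000
current_annual_cost = 8

# How many times smaller the new bill is than the old one.
reduction_factor = previous_annual_cost / current_annual_cost

# The same change expressed as a percentage of the old bill saved.
percent_saved = (1 - current_annual_cost / previous_annual_cost) * 100

print(f"Reduction factor: {reduction_factor:,.0f}x")
print(f"Percent saved: {percent_saved:.3f}%")
```

The ratio works out to 12,500, which is in line with the roughly 12,000X figure in the headline.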

“We’re very good at optimizing our computation and figuring out how far we can push the limit to make our models run,” he told VentureBeat. “That’s what we’ve always been known for with Databricks.”

PS&E reduces OpEx by 98%

In addition to its three basketball teams, the Indianapolis-based PS&E runs a Pacers Gaming business, hosts March Madness events and runs an event business that is busy 300 or more days a year at Gainbridge Fieldhouse (concerts, comedy shows, rodeos and other sporting events). The company announced last month that it will build a $78 million Indiana Fever Sports Performance Center, which will be connected by a skybridge to the arena and a parking garage (expected to open in 2027).

This makes for a staggering amount of data, and of data sprawl. From a data infrastructure perspective, Chavez noted, the organization had two completely independent data warehouses built on Microsoft Azure Synapse Analytics, and the tooling and skill sets of different teams within the business varied greatly.

Although Azure Synapse was a great tool for connecting to external platforms, it was too expensive for an organization the size of PS&E, he explained. Integrating the company’s ML with Microsoft Azure Data Studio also caused fragmentation.

To address these problems, Chavez switched to Databricks AutoML and the Databricks Machine Learning Workspace in August 2023. The initial focus was on configuring, training and deploying models around ticket prices and game demand.

Both technical and non-technical users immediately found the platforms helpful, Chavez noted, and they quickly sped up the ML process (and sent costs plummeting).

“It dramatically improves response times for my marketing team, because they don’t have to know how to code,” said Chavez. “It’s all buttons for them, and all that data comes back down to Databricks as unified records.”

Further, his team organized the company’s 60-some-odd systems into Salesforce Data Cloud. He reports that they now have 440X the data in storage and 8X the data sources in production.

PS&E operates today at just under 2% of its previous annual OpEx. Chavez said they save hundreds of thousands of dollars per year on operations. “We invested it in customer data enrichment,” he said. “We invested in better tools for my team and the analytics units across the company.”

How did his team get compute so low? Chavez explained that Databricks has continuously refined cluster configurations, improved connectivity options to schemas and integrated model outputs back into PS&E’s data tables. The powerful ML engine “continuously refines, merges and predicts” on PS&E’s customer records across all systems and revenue streams. Chavez reported that AutoML models sometimes make it to production without further tweaking by his team.

“Truthfully, the only thing you need to know is the size of your data and how long it will take to train,” said Chavez. He added: “It is on the smallest possible cluster size, it may just be a memory-optimized cluster, but it is just knowing Apache Spark pretty well and knowing how we could store and access the data fairly efficiently.”

Who is most likely to purchase season tickets?

Chavez’s team uses data, AI and ML to build propensity scores for season ticket packages. “We sell an ungodly amount of them,” he said.

The goal is to determine which customer characteristics influence where they sit. Chavez explained that the team is geolocating addresses to correlate demographics, income and travel distances. They are also analyzing customers’ purchase history across retail, food and beverage, mobile apps and other events they may attend on the PS&E campus.
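As an illustration of the travel-distance feature mentioned above, here is a minimal Python sketch of a great-circle (haversine) distance between a geocoded address and the arena. This is an assumption about how such a feature could be computed, not a description of PS&E’s pipeline; the function name and both sets of coordinates are hypothetical, and straight-line distance is only a rough proxy for actual travel.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative coordinates only: an approximate downtown Indianapolis point
# and a made-up geocoded fan address.
arena = (39.764, -86.156)
fan_home = (39.978, -86.118)  # hypothetical
print(round(haversine_km(*arena, *fan_home), 1), "km")
```

A distance computed this way could sit alongside demographics and purchase history as one input to a propensity model.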

They also pull in data from StubHub, SeatGeek and other vendors outside of Ticketmaster to evaluate price points and determine how well inventory is moving. Chavez explained that they can combine this data with everything they know about a customer to determine where that customer is likely to sit.

Armed with this data, they can then, for example, upsell a customer from Section 201 to Section 101 center court. “Now we are able not only to resell his seat in the higher deck, but also to sell a smaller package on the seats he bought mid-season, using the same characteristics,” said Chavez.

Data can also be used to improve sponsorships, which are crucial for any sports franchise.

“Of course they want to align themselves with organizations that overlap with theirs,” said Chavez. “So can we better enrich? Can we better predict the future? Can we do custom segments?”

The goal is to create an interface that allows users to ask questions such as: “Give me a segment of Pacers fans in their mid-to-late 20s who have disposable income.” Going even further, the interface could return the percentage of that segment that overlaps with a sponsor’s data.
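The overlap figure described above is, at its simplest, a set intersection. The Python sketch below shows that calculation under loud assumptions: the function name and the customer identifiers are made up for illustration, and a real system would match records inside a governed environment rather than on plain IDs.

```python
def overlap_percentage(fan_segment: set[str], sponsor_customers: set[str]) -> float:
    """Percentage of the fan segment that also appears in the sponsor's customer list."""
    if not fan_segment:
        return 0.0  # avoid dividing by zero for an empty segment
    shared = fan_segment & sponsor_customers
    return 100.0 * len(shared) / len(fan_segment)


# Hypothetical, already-anonymized customer IDs.
pacers_segment = {"a1", "b2", "c3", "d4", "e5"}
sponsor_list = {"c3", "d4", "z9"}

print(f"{overlap_percentage(pacers_segment, sponsor_list):.1f}% of the segment overlaps")
```

The denominator here is the fan segment; dividing by the sponsor list instead would answer a different question, namely how much of the sponsor’s audience the team already reaches.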

“When our partnership teams try to close these deals, they can pull information on demand without having an analytics team do it for them,” said Chavez.

To further support this goal, his team is looking at building out a data clean room, a secure environment that allows sensitive data to be shared. This is especially useful for sponsors and for collaborations with other teams and with the NCAA (which is headquartered in Indianapolis).

“Response time is the key for us at this point, whether it’s internal or external,” said Chavez. “Can we reduce the knowledge required to cut up and sort information using AI?”

Data collection and AI for understanding traffic patterns, improving signage

Chavez’s team also focuses on examining where people are on the PS&E campus (which consists of a three-tiered arena with an outdoor plaza). Chavez explained that data collection capabilities are already available throughout the network infrastructure via WiFi access points.

“When you walk into the stadium, your phone will ping all of them even if you do not log in,” he said. “I can see you moving. I don’t even know who you are, but I can tell where you’re going.”

This can help guide people around the arena, for example when they want to buy a pretzel or find a concession stand, and it can help his team decide where to place food and merchandise kiosks.

Similarly, location data can help determine optimal spots for signage, Chavez explained. One interesting way to estimate signage impression counts is to place vision gradients at spots matching average fan height.

“Let’s calculate the number of people surrounding them to see how well someone would have seen it walking through,” said Chavez. “So I can tell my sponsors: you got 5,000 views on this, and 1,200 were really good.”
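To make the idea of total views versus “really good” views concrete, the Python sketch below counts location pings within two distance thresholds of a sign. It is a deliberately simplified stand-in, not the team’s method: it ignores crowd density, sight lines and vision gradients, and the function name, radii and coordinates are all hypothetical.

```python
import math


def count_impressions(fan_positions, sign_position, view_radius_m=30.0, good_radius_m=10.0):
    """Return (views, good_views): fans within view_radius_m, and the subset within good_radius_m."""
    views = 0
    good_views = 0
    for position in fan_positions:
        distance = math.dist(position, sign_position)  # straight-line distance in metres
        if distance <= view_radius_m:
            views += 1
            if distance <= good_radius_m:
                good_views += 1
    return views, good_views


# Hypothetical (x, y) positions in metres on a concourse floor plan.
positions = [(2.0, 3.0), (12.0, 5.0), (25.0, 0.0), (40.0, 40.0)]
views, good_views = count_impressions(positions, sign_position=(0.0, 0.0))
print(f"{views} views, {good_views} of them close-range")
```

A fuller model would weight each view by how crowded the surrounding area was and by the viewing angle, which is the kind of question the quote above is pointing at.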

In the same way, when fans are sitting in their seats, they are surrounded by digital displays and signs. Location data can be used to determine the quality and quantity of impressions depending on where fans are sitting. “If the ad only appeared on the screen for ten seconds in the third quarter, who would have seen the ad?” said Chavez.

Once his team has adequate locational data, it plans to work with Indiana University’s VR lab to model the entire PS&E campus. “We’re going to have this very fun sandbox where we can run around and answer all the 3D space questions I have been bugging myself with for the past two years,” said Chavez.
