New Product Simplifies Discovery of Consumers’ Profile and Behaviors via Big Data
BBC Worldwide First to Use Product as Flagship Client
Pittsburgh, PA, May 21, 2015 – Rhiza, an online platform pioneering the way marketers and salespeople make Big Data actionable, announced the launch of Rhiza for Marketing, a new tool designed to sort through large datasets and pinpoint actionable insights. Using the product, marketers can easily and efficiently discover granular characteristics on existing and prospective customers, faster than ever before.
Rhiza for Marketing was created to enable marketers to ask different questions about their customers and connect the dots between datasets to discover constellations of human behavior. The tool offers a range of visualizations to showcase data and simplifies the process of cross-tabbing through highly automated processes. Marketers can use the web-based tool to derive new insights about their customers and their competition, identify locations and behaviors of their most promising prospects, suggest winning strategies to reach targets and analyze the impact of a message across any type of media, including television, print, digital and social. The tool also simplifies the process of profiling behavior, with the ability to create compelling infographics that compare data points across a segment, such as women ages 18 to 25 who regularly watch television.
“We developed Rhiza for Marketing with marketers in mind,” said Josh Knauer, CEO of Rhiza. “Big Data has transformed the way marketers do their jobs, but using it and acting upon it can be challenging. Rhiza for Marketing lets users easily derive key insights from multiple, large datasets that can be used to power campaigns. Our goal in creating this product is to empower marketers to incorporate the available information into their decision making process so they can strategically develop better ways to engage with audiences.”
Rhiza for Marketing launched with BBC Worldwide as the flagship client, which uses the product to manage its proprietary data. With the tool, BBC Worldwide is working to enable its staff around the world to easily craft custom summary reports profiling its consumers. This includes reports such as the latest information on how an audience consumes television alongside other media consumption and how BBC can best engage them with its brands. With a wide range of content, business partners and countries, BBC Worldwide needs to enable its staff to customize and tailor these reports quickly and easily.
“We are using Rhiza for Marketing as a resource to empower our staff to quickly and easily cut, present and use the valuable data we have available,” said David Boyle, EVP, Insights at BBC Worldwide. “With Rhiza, my team is working on bringing together disparate data points, giving us a more complete picture of consumers and their behavior through a simple process. We are using Rhiza for Marketing to help our people around the world gain valuable insights that will allow them to build engagement with our shows.”
Rhiza for Marketing supports a range of data sources, from large commercial providers and third party data to proprietary datasets. Whether marketers are working with their own consumer survey data, internal brand data, or datasets from companies like Nielsen or IHS/Polk, Rhiza for Marketing enables them to seamlessly view all datasets in one place, with the ability to cross-reference between sources.
For an exclusive opportunity to join the next generation of data-driven marketers, visit go.rhiza.com/RhizaforMarketing.
Rhiza is an online platform pioneering the way marketers and salespeople make Big Data actionable. The easy-to-use platform allows anyone to create a polished, data-driven marketing presentation in minutes, exported straight to PowerPoint, mobile devices or the web. With Rhiza’s tools, users can visualize, analyze, understand and share information derived from disparate data sets, delivering detailed recommendations based on integrated consumer insight. Rhiza, which is headquartered in Pittsburgh, PA, has increased revenue for media companies and consumer brands, and is used by a rapidly growing list of major brands, including: BBC Worldwide, Comcast, Univision, Cox Media, Gamut and Cox Reps. For more information, please visit rhiza.com.
We’re pleased to present our second edition of the Rhiza Tech Talk series with a new post by senior software engineer Matt Pickell. In this post, Matt talks about his process for placing dynamic shields on major roads for our geography-based data visualizations.
At Rhiza, we pull multiple datasets together to create very simple visualizations. Using vector tile data from Mapbox allows us to have a lot of control over these visualizations and produce customizable, compelling maps.
Vector tiles are packaged with separate layers containing all information needed to create a single map tile. One of these layers contains information for labeling the roads. The detail contained in this layer is sufficient to interpret the labels around pivots in the roads — making the text bend with the road. Label placement on maps is an interesting topic! The specific topic I will describe here is a little simpler: Placing shields on major roads.
When first working out how to implement this, we tested a very simple approach: summarize the line-label segments into individual points and place a road label at each point. For each label feature, take the center point in the line and place the shield there.
There were many issues with this approach. It did not consider the road holistically and resulted in poorly placed and frequently overlapping shields. The assumption was that the vector data spaced labels in some fashion, which it does not.
After taking a closer look at the data provided, we found that if you concatenate the line segments (matching on label text) in the road label layer you have basically recreated the road itself. Using this information we developed a method of label placement that considers each road as a whole and places labels by looking at all roads together. The result is cleanly spaced shields with no overlap, created on the fly as the user interacts with the map.
The first thing we need to do is to transform the label data into a new data structure grouping geojson features by road. Data manipulation can be performed on the data as well so that labels are formatted as desired, etc. The process discussed here starts from the geojson output of another process that has already downloaded all tiles, extracted the vector tile data, and filtered it based on the granularity of roads we want. So the geojson feature collection we have now contains the road label data from all tiles on the visible map. (These processes are performed in the client using web-workers.)
In order to split the data by road, we iterate over the geojson collection and check each label text. We’re specifically looking for highway shields, so the strings are checked for validity using a simple regular expression.
As we check these labels, we run into complexities like a road that is named ‘US 22;US 30;I 376’. For cleaner placement of the shields, features like this are cloned: one feature with the label ‘US 22;US 30’, and one feature with the label ‘I 376’. The ‘US’ shields in the example represent a combined highway and are left together, placed together, and later styled to sit side-by-side on the map. If we encounter a route like ‘I 376 Business’, we have the choice to drop it or shorten it to ‘I 376B’, depending on what we want to show.
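The validation and cloning steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code that runs in our web workers: the pattern `SHIELD_RE`, the `PA` state prefix, and the helper names `split_label` and `shorten_business` are assumptions made up for this example.

```python
import re

# Hypothetical pattern: accepts labels such as "I 376", "US 22", "PA 28",
# optionally followed by " Business". The prefixes are assumptions.
SHIELD_RE = re.compile(r"^(I|US|PA) \d+( Business)?$")


def split_label(label):
    """Clone a combined label into one label per road class.

    Routes that share a prefix (e.g. two US highways) stay together so
    their shields can later be styled side by side.
    """
    parts = [p.strip() for p in label.split(";")]
    valid = [p for p in parts if SHIELD_RE.match(p)]
    groups = {}  # prefix -> routes, in first-seen order
    for part in valid:
        prefix = part.split(" ")[0]
        groups.setdefault(prefix, []).append(part)
    return [";".join(routes) for routes in groups.values()]


def shorten_business(label):
    """Shorten 'I 376 Business' to 'I 376B' (one of the two options)."""
    return re.sub(r" Business$", "B", label)
```

Calling `split_label` on the combined label from the example yields one label for the two US routes and a separate one for the interstate.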
In an attempt to minimize what we process later, each valid line segment is checked against, and then placed in, an rtree. (We are using the rbush implementation. An rtree is used for spatial indexing, so that we can quickly and efficiently determine when two bounding boxes overlap.) If a collision is detected in the rtree, then we have a line segment that already covers this section of road. A simple resolution here is to just keep the larger segment (i.e., the one with more points) and drop the shorter. You could also check for overlap and combine the segments, dropping the overlap, but we’re not drawing the roads with this information; we’re only trying to approximate the road so we can fit a line to it later. Our main goal here is to remove redundant line segments in the hope of reducing processing later.
The final step, then, is to extract the data from the rtree and place it into a ‘lookup’ mapping by road label. The result now is a map of road label -> array of road label features. This is a clean format for looking at each road as a whole in the next step: Placement.
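As a rough illustration of the de-duplication and grouping just described, here is a small Python sketch. It swaps the rtree for a plain list scan to stay self-contained, and it assumes each feature is a dict with hypothetical `label` and `coords` keys. Treating a collision as "same label and overlapping bounding boxes" is also an assumption of this sketch.

```python
from collections import defaultdict


def bbox(coords):
    """Bounding box (min_x, min_y, max_x, max_y) of a list of (x, y)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def overlaps(a, b):
    """True when two bounding boxes intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def dedupe_and_group(features):
    """Drop redundant segments, then build a label -> features lookup."""
    kept = []
    for feat in features:
        box = bbox(feat["coords"])
        # Linear scan stands in for the rtree collision query.
        hit = next(
            (k for k in kept
             if k["label"] == feat["label"]
             and overlaps(bbox(k["coords"]), box)),
            None,
        )
        if hit is None:
            kept.append(feat)
        elif len(feat["coords"]) > len(hit["coords"]):
            # Keep the segment with more points, drop the shorter one.
            kept[kept.index(hit)] = feat
    lookup = defaultdict(list)
    for feat in kept:
        lookup[feat["label"]].append(feat)
    return dict(lookup)
```

A real spatial index makes the collision query much cheaper than this linear scan; the sketch only shows the keep-the-larger-segment rule and the final lookup shape.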
When we place labels for each road, we want two things:
- The labels should be spaced well along the individual road. Equal spacing, and only X shields per road length (where X is a value passed in to the web worker)
- Across all visible roads no labels should overlap
To achieve both of these with the best placement we prioritize how the road shields will be placed: interstates are more important than US highways, US highways trump the state roads. We then use another rtree in order to prevent any overlap. By prioritizing the processing order, more important labels are placed in the rtree first and have a better placement.
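The prioritization can be pictured as a simple sort before placement. The sketch below is an assumption-laden Python illustration: the `PRIORITY` table and the `placement_order` helper are hypothetical names, and any prefix other than `I` or `US` is treated as a state road.

```python
# Lower rank is placed first. Prefixes are assumptions for this sketch.
PRIORITY = {"I": 0, "US": 1}
STATE_ROAD_RANK = 2


def placement_order(labels):
    """Return road labels ordered interstates, US highways, state roads."""
    def rank(label):
        prefix = label.split(" ")[0]
        return PRIORITY.get(prefix, STATE_ROAD_RANK)

    # sorted() is stable, so roads of equal rank keep their input order.
    return sorted(labels, key=rank)
```

Because the more important roads go into the placement rtree first, they claim their preferred positions before lower-priority shields are tested.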
Now comes some fun with math in order to figure out the shield placement! At the end of the day, we simply need an X/Y position and label for each shield. To determine how many shields a road gets, we have a single variable passed in to the web-worker for how many shields we want per some linear-length on the visual map. This is just a general scaling to give an idea of how many shields we want per road, and the specific linear-length assumed here is “width of the map.”
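One way to read that scaling is as a simple proportion between the road's fitted length and the map width. The Python sketch below assumes that interpretation; the function name `shields_for_road` and the rule that every road gets at least one shield are assumptions, not something the process above specifies.

```python
def shields_for_road(road_length, map_width, shields_per_map_width):
    """Scale the shield count by road length relative to the map width.

    Assumes both lengths are in the same units (e.g. screen pixels) and
    that even a very short road still gets one shield.
    """
    scaled = shields_per_map_width * road_length / map_width
    return max(1, round(scaled))
```

For example, with two shields per map width, a road one and a half map widths long would get three shields under this reading.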
In the prioritized order, we run each individual road through a placement process:
- Determine the bounding box of the road based on the points in the line segments. This will give us the endpoints that we need the shield(s) to be inside of when placed. We can also calculate the length of the line of best fit inside this bounding box in order to determine how many shields this road segment gets. This bounding box is clipped at the edges of the visible map if it extends outside so we’re not placing shields in non-visible space.
- Find the first-degree polynomial line of best fit and r-squared value for the road. Using a first-degree polynomial to represent the line does a pretty good job. The r-squared value tells us how well the line fits the data in the line segments. The point of using a line to approximate the road is that it is much easier to measure distances for placement of shields.
- If the r-squared value is too low, we are not going to get good placement of shields on the road. In this situation, the road features are sorted (by either latitude or longitude, depending on some interrogation of the data), and broken into X segments, where X is the number of shields that will be placed on that specific road. For each segment, a bounding box and line of best fit are calculated.
Note on r-squared
Calculating the r-squared for the line is important in order to make sure we actually are improving our placement, and not creating something worse. Here is an example of I 79 near Pittsburgh. You can see that the best fit line is really not a good fit at all, and it has a very low r-squared.
When broken down into sub-pieces, we get much better lines.
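To make the fit-then-split idea concrete, here is a minimal Python sketch of an ordinary least-squares line with its r-squared value, plus a helper that breaks a road into pieces. The function names are hypothetical, the split always sorts by x (the real process chooses latitude or longitude after inspecting the data), and a perfectly vertical road is rejected here rather than handled.

```python
def fit_line(points):
    """Least-squares fit y = slope * x + intercept, with r-squared."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # Vertical road: this sketch does not handle it (swap axes instead).
        raise ValueError("all points share the same x value")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    r_squared = 1.0 if ss_tot == 0 else 1 - ss_res / ss_tot
    return slope, intercept, r_squared


def split_sorted(points, pieces):
    """Sort points by x and cut them into roughly equal chunks."""
    ordered = sorted(points)
    size = -(-len(ordered) // pieces)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```

A road that bends back on itself, like a "V", fits a single line poorly, but each half on its own fits well, which mirrors the I 79 example.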
- Each road or road segment has its shields placed into the rtree. For each segment, points along the line of best fit are selected based on the number of shields to be placed on that road. For each point, an array is built with the actual points on the road, sorted by distance from the selected point. In order, each point is tested against the rtree to determine if it is a valid placement. Once a valid one is found, the label is placed and we move on to process the next road.
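The final placement step might look something like the following Python sketch. It is illustrative only: a plain list of placed boxes stands in for the rtree, targets are spaced evenly in x along the fitted line, and the names `place_shields` and `half_size` (half the shield's width and height) are assumptions.

```python
import math


def overlaps(a, b):
    """True when two (min_x, min_y, max_x, max_y) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def place_shields(road_points, slope, intercept, count, placed, half_size):
    """Place up to `count` shields on one road or road segment.

    `placed` holds the boxes of shields already on the map and is
    updated in place, so higher-priority roads should be run first.
    """
    xs = [x for x, _ in road_points]
    x_min, x_max = min(xs), max(xs)
    results = []
    for i in range(count):
        # Evenly spaced target on the line of best fit.
        tx = x_min + (x_max - x_min) * (i + 1) / (count + 1)
        ty = slope * tx + intercept
        # Real road points, nearest to the target first.
        candidates = sorted(
            road_points, key=lambda p: math.hypot(p[0] - tx, p[1] - ty)
        )
        for cx, cy in candidates:
            box = (cx - half_size, cy - half_size,
                   cx + half_size, cy + half_size)
            if not any(overlaps(box, other) for other in placed):
                placed.append(box)
                results.append((cx, cy))
                break  # this shield is placed; move to the next target
    return results
```

If every candidate point for a target collides with an existing shield, that shield is simply skipped in this sketch.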
Finally, the shield-placement rtree now contains only the relevant information needed. The information in the tree is extracted into a new geojson feature collection and passed back to the main UI thread for rendering on the chart. This rendering is fast and light since the web-worker reduced the road-label layer down to a collection of points. The result is a much cleaner, more intelligent (and dynamic) layout of the shields.
Rhiza is about more than just helping companies use data to make better decisions. We’re also about serving our community and helping people in need. That’s why once a quarter, the Rhiza team comes together and dedicates a Saturday morning to making a difference and volunteering at a local community services organization.
On Saturday, April 11th, the Rhiza team rounded up family members and spent the day packing bags, unloading groceries, and organizing canned goods at the Rainbow Kitchen, a local food bank and community services center in Homestead.
The Rainbow Kitchen, established in 1984, works to improve the quality of life for low-income families by providing support, programs and resources that address hunger and nutrition. The organization has a very small team and relies on volunteers to support operations.
The Rhiza team packed over 120 food bags each containing spaghetti noodles, tomato sauce, and canned vegetables. The bags were sent to two Pittsburgh-based senior centers to help elderly individuals who are unable to get out to get food on their own. Other volunteers unloaded groceries, while some stocked cans onto pallets for pantry day.
Later this summer, our team will also be holding a toiletry drive to aid individuals who are unable to purchase toothpaste and soap with food stamps. Leave us a comment if you’d like more information on how to contribute toiletries or if you have other ideas on how our team can help the community.
Our company culture is one that cares for our community, environment, and planet. Our team is composed not just of Python coders and SaaS sales stars, but also of generous and considerate individuals, and we try to extend our culture outside of our office. If you’re interested in joining the Rhiza team and looking for an opportunity to make a huge impact both in the workplace and in the community, check out our current open positions (rhiza.com/careers).
A gang of young Rhizans hijacked the office on Thursday to take part in a two-hour series of fun office activities. Rhiza celebrated Take Your Kids to Work Day by inviting all employees to bring their kids in for an afternoon filled with fun, snacks, and creative activities.
Ten kids, ranging in age from 4 to 10, took the office by storm and participated in several games that included creating a city out of cardboard boxes (complete with parks, a co-op and bike lanes), making visualizations with Post-It notes and bar graphs, and programming a robot to make a peanut butter and jelly sandwich. The result was a combination of chaos, creative thinking, and a ton of fun. The event allowed kids to use creativity, work in a team environment and eat healthy snacks while exposing them to some of the themes and concepts that run deep at Rhiza.
While yesterday’s program was created especially for this year’s Take Your Kids to Work Day, Rhiza actually lets employees bring their kids to work any day. The company’s “kids in the workplace policy” is just one of the many perks of being a Rhiza employee, and allows parents to bring their children to work any time, a super convenient policy for parents during snow days and in the summer months.
If you’d like to be a part of a company experiencing explosive growth with fantastic perks and a family-friendly culture, check out our current openings (rhiza.com/careers).
A very important part of our company culture here at Rhiza is ensuring that we as a team take an active role in caring for our planet and environment. We demonstrate this by practicing eco-friendly habits in our workplace and encouraging friends and family members outside the office to do the same. Whether we’re recycling materials in the office or using our technology for the greater good of the planet, our organization was founded on the core belief that business should be conducted in a way that is positive for people, the planet, and profits.
This Earth Day, we wanted to share our ideas and beliefs by using our tool to highlight the areas of the country that also take an active role in environmental protection. Using the Rhiza platform combined with Experian Simmons Local survey data collected in the spring of 2014, we generated location-based data visualizations that show the different areas of the country that share our eco-friendly attitude.
The 2015 MLB regular season is just a couple of weeks old, and there is already a lot of buzz surrounding Rhiza’s hometown Pittsburgh Pirates. Following two consecutive winning seasons and two playoff runs, many have set high expectations for the team and ESPN’s Buster Olney even selected the Pirates as his pick to win the 2015 World Series.
However, as many longtime fans are aware, the twenty consecutive losing seasons that preceded them weren’t as pleasant, and consequently, the team’s fanbase deteriorated while attendance dropped to all-time lows.
So given the Pirates’ recent rebound and the current hype for the 2015 season, we were curious to learn more about the current state of the Pirates fanbase. We identified a few relevant questions that we wanted to answer: How does Pittsburgh stack up against the rest of the nation in terms of MLB fans? Which NL Central team has the most MLB fans? Which Pittsburgh neighborhoods have the most MLB fans? By using Simmons Local survey data and the Rhiza Platform, we were able to quickly uncover insights that allow us to answer these questions.
According to a Simmons Local survey that was conducted just a few months following the spectacular 2013 Wildcard game, 16.43% of the nationwide respondents claimed to be “Very Interested in Major League Baseball”. We took the total number of weighted individuals in each NL Central designated market area (DMA) that made the same claim and indexed it against the US average. On the scale, 100 is equivalent to 16.43%.
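For readers who want the arithmetic spelled out, the index is just a market's percentage divided by the national percentage, scaled so the national figure equals 100. The Python sketch below uses the 16.43% figure from the survey; the function name and the 20% market used in the usage note are hypothetical and are not taken from the survey results.

```python
US_AVERAGE_PCT = 16.43  # share of US respondents "very interested" in MLB


def index_vs_us(market_pct, us_pct=US_AVERAGE_PCT):
    """Index a market's percentage so that the US average equals 100."""
    return market_pct / us_pct * 100
```

Under this formula, a hypothetical market where 20% of respondents made the same claim would index at roughly 122, meaning it sits about 22% above the national average.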
Pittsburgh, traditionally known as a “football town”, and more recently a “hockey town”, had significantly more respondents who are “very interested in Major League Baseball” than any other NL Central market and ranked significantly higher than the US average. Equally surprising, Chicago came in dead last and indexed well beneath the US average. This could be attributed to the Cubs’ poor 2013 season, and if so, is there a correlation between the number of fans and the 2013 regular season standings?
NL Central Teams by MLB Fans Indexed by DMA, Spring 2014
1. Pittsburgh Pirates
2. Milwaukee Brewers
3. St. Louis Cardinals
4. Cincinnati Reds
5. Chicago Cubs
2013 NL Central Standings
1. St. Louis Cardinals (97-65)
2. Pittsburgh Pirates (94-68)
3. Cincinnati Reds (90-72)
4. Milwaukee Brewers (74-88)
5. Chicago Cubs (66-96)
While the results aren’t too far off, Milwaukee jumps out the most. The Brewers had a pretty dreadful 2013, but Milwaukee ranked second in its weighted percent of baseball fans which suggests they could possibly have a pretty dedicated fanbase.
Drilling down to Pittsburgh as a whole, the Rhiza platform allows us to see which neighborhoods are home to all of these Bucco fans and which areas still haven’t joined the bandwagon. This graphic shows the weighted percent of MLB fans in each Pittsburgh zip code indexed against the US average of 16.43%.
Pittsburgh fans are clearly scattered throughout the city with a significant portion of them residing in western suburbs while some southwest communities are still feeling burned by the longtime losing streak and skeptical of recent success.
Just for fun, here are some maps for each of the other NL Central cities:
Let’s go Bucs!
We’re very excited to have Rhiza CEO, Josh Knauer, speaking at Argyle’s Chief Marketing Officer Spotlight Forum at 8:20am tomorrow, April 8th, 2015 in New York City.
As we continue our countdown to the Forum next week in New York, we thought we’d put together another post detailing the consumer trends in the financial industry. We hope to see you there!
Is Amex an Urban Credit Card?
Since we’re looking forward to meeting industry leaders from both American Express and MasterCard at next week’s event, we were curious to find which areas in our hometown of Pittsburgh have the highest percentage of MasterCard and American Express cardholders. We pulled Simmons Local data and generated a map of Pittsburgh by designated market areas (DMAs).
First, we checked out American Express. According to the results, Amex cardholders are mostly confined to the Greater Pittsburgh Area and don’t spread out into surrounding towns and counties. We can also note that there are more Amex cardholders north of the city than south of it, while Swissvale had the highest population of cardholders. The southernmost DMAs have the fewest Amex cardholders.
Next, we checked out MasterCard holders, and discovered they are much more widespread throughout the city than Amex users. Concentrated populations of MasterCard holders pop up in the Greater Pittsburgh Area and exist in nearly every direction.
So do people in the city prefer Amex while suburban populations are more inclined to use MasterCard? Why are areas north of the city more likely to have Amex and MasterCard consumers than areas south of the city? To find out, we also decided to take a look at how many people in the Pittsburgh area use a credit card, period.
Interestingly enough, people living downtown and areas northwest of the city are more likely to have credit cards than areas south and northeast. The fact that many individuals in the south don’t have credit cards in general could explain why Amex cardholders are pretty scarce.
In accordance with John Oliver’s request to bring back the true meaning of April Fools Day, Rhiza has decided to take the Last Week Tonight No-Prank Pledge. Instead of celebrating April Fools Day with pranks and hoaxes, we decided to conjure up some interesting data visualizations to prove that using data to tell a meaningful story is not a joking matter.
In the spirit of April Fools Day, we decided to find out what US regions have the highest populations of gullible people. After quickly looking through some Simmons Local data, we were able to generate a report of individuals in the US who claim to be easily swayed.
To take it one step further, we were also interested in drawing location-based comparisons between people who are easily swayed and those who are frequent users of a product or a service that has a reputation for being incredibly persuasive. So we turned our attention to WebMD.com. WebMD has received past criticism for persuading users that their medical symptoms could be signs of something more serious, and has even been referred to as a “prescription for fear” by the New York Times Magazine. So the question is: are states with the most gullible populations also the ones with the most WebMD users?
First, we used Simmons Local data and the Rhiza platform to figure out which areas of the country might be the most susceptible to April Fools Day pranks. We then overlaid the data across a map of designated market areas (DMAs).
The most significant conclusion is that the south is the most gullible region of the country, while the suspicious people of the northeast claim to be absolute in their beliefs. Areas of the southwest and small pockets in the northwest also rank above the national average.
Next, we used Simmons Local data to determine which DMAs have the most individuals who used WebMD.com in a 30-day timeframe.
The north and south swapped, while the west remained the same. Apparently the easily persuaded states in the south aren’t as likely to use WebMD to check up on their symptoms, but the less gullible folks in the northeast are heavy frequenters of the site.
Since the south contains the largest population of people who consider themselves easily swayed, its low percentage of WebMD users might be a good thing. Otherwise, there might be a huge population of hypochondriacs beneath the Mason-Dixon line. Likewise, perhaps WebMD’s most productive users are in the northeast, where residents might be less likely to be convinced that minor aches and pains are signs of a terminal illness.
Based on the large population of easily swayed people, one has to wonder if there is an increased number of April Fools Day pranks played in the southern states. If this is the case, then maybe we should all work hard to ensure the No Prank Pledge makes its way to the south in 2016.
Earlier this week, CNBC reached out to Rhiza to leverage our expertise in data visualization.
Following the announcement of the Heinz and Kraft merger on Wednesday, CNBC wrote a fascinating story detailing the location of Heinz ketchup and Kraft macaroni and cheese consumers. It also highlighted the Rhiza Platform’s ability to quickly combine different datasets with geography based visualizations in order to discover interesting trends and give data meaning.
The Rhiza Platform, which allows users to generate location-based data reports, can be used to quickly identify stories that aren’t readily apparent in an Excel spreadsheet. And since the reports are generated on the fly, Rhiza was able to quickly provide data visualizations to supplement CNBC’s story in a matter of minutes.
The results allowed CNBC to publish an interesting article showing that the northeast and the midwest are the main markets for Heinz ketchup and Kraft macaroni and cheese, respectively.
States with the most Heinz ketchup consumers
States with the most Kraft macaroni and cheese consumers
Taking it one step further, Rhiza was able to overlay both datasets on a single map in order to show the states that have the highest concentration of both Heinz ketchup and Kraft macaroni and cheese consumers. The results show that the midwest has a higher percentage of people who consume both products than the northeast.
States with the most Heinz ketchup and Kraft macaroni and cheese consumers
Because of our partnership with Experian, Rhiza offers a wealth of data. When that data is combined with our cutting-edge visualization platform, we’re able to help companies answer tough questions like never before and uncover consumer insights quickly and efficiently.
St. Patrick’s Day is an opportunity to celebrate Irish heritage in a variety of ways, whether by wearing green, watching a parade, or enjoying a variety of Irish beers. Of the Irish beers consumed on St. Patrick’s Day, Guinness, the popular Irish stout, traditionally leads the pack. According to WalletHub, over 13 million pints of Guinness will be consumed on St. Patrick’s Day in 2015.
So do states with larger Irish populations drink more Guinness? Over at Rhiza, we decided to celebrate the holiday by finding out. According to the U.S. Census Report, nine of the top 10 states with the largest Irish-American population are in the northeast. Massachusetts ranked number one, with 22.5% of the state’s population identifying as having an Irish ethnic origin.
Using Simmons Local survey data, we were able to generate visualizations that display the percentage of survey respondents who self-identified as Guinness drinkers. So is Massachusetts also number one in terms of Guinness consumers?
Figure 1: Who drinks Guinness the most – Northeast states
Interestingly enough, despite ranking 6th in Irish-American population, Vermont lands the number one spot for Guinness enthusiasts, with 4.7% of surveyed respondents claiming to enjoy the Irish stout. This is not only the highest percentage in the northeast, but also the highest across the country. Massachusetts is close behind with 4.4%.
Based on Figure 1, there appears to be a correlation between regions with a high Irish-American population and those where respondents claim to enjoy Guinness, as the northeast leads the nation in both.
So how does the rest of the nation stack up?
Figure 2: Who drinks Guinness the most – Midwest states
In second place is the midwest, with the top three Guinness-drinking states being the northernmost: Minnesota, Wisconsin, and Michigan. An interesting pattern that is evident in the first two figures is that the percentage of Guinness drinkers decreases as we move south. It’ll be interesting to see if that continues.
Figure 3: Who drinks Guinness the most – West Coast
As we can see in this figure, Guinness isn’t as popular on the west coast as it is on the east coast, and there is generally a smaller Irish population present in these states as well. However, Colorado possesses a fairly high percentage despite ranking 17th on the Irish population list.
Figure 4: Who drinks Guinness the most – Southern States
Home to both the smallest Irish population and the fewest Guinness fans is the south. The biggest Guinness-drinking state is Florida with 2.8%; however, it is also true that Florida is home to many transplants from the north.
Based on the survey data, Guinness is most popular in the regions with the largest Irish-American populations, which suggests a correlation between Irish populations and Guinness consumption. And since the majority of Irish-Americans live in the north, Guinness drinkers in the south are a bit more sparse.
Happy St. Patrick’s Day!