Video Advertising by Twitch Influencers*, Lecture notes of Marketing

Edinboro University of Pennsylvania (EU)Marketing

Typology: Lecture notes

2021/2022

Uploaded on 09/12/2022

Video Advertising by Twitch Influencers*

Yufeng Huang
University of Rochester
Simon Business School

Ilya Morozov
Northwestern University
Kellogg School of Management

March 23, 2022

Abstract

We study the effectiveness of influencer marketing in the video game industry. To this end, we construct a novel dataset on video game streaming on Twitch.tv, the largest video game streaming platform in the world, by monitoring live streams every 10 minutes for eight months. Leveraging these high-frequency data, we isolate plausibly exogenous variation in streamers’ daily schedules and use it to estimate the extent to which live streaming brings additional players into broadcasted games. We find that organic live streams only marginally increase the number of concurrent players in these games. We also find that sponsored streams solicited by game publishers are even less effective than organic streams, implying that sponsored streams generate, on average, negative return on investment (ROI). This result suggests that influencer promotions are less effective than previously thought. We then examine heterogeneous returns to streaming by estimating generalized random forests, and we find that sponsored streaming can significantly benefit games released by small publishers, inexpensive games, and “niche” games that strongly appeal to small groups of consumers. Therefore, despite the negative average ROI, influencer promotions may generate high returns when they promote games by lesser-known publishers or inform consumers about appealing game attributes.

*Contacts: Yufeng Huang, email: yufeng.huang@simon.rochester.edu; Ilya Morozov, email: ilya.morozov@kellogg.northwestern.edu. We thank seminar and conference participants at University of Delaware, HKUST, Marketing Science Conference, and University of Rochester for their helpful comments and suggestions.
We also thank James Ryan for excellent research assistance. All errors are our own.

1 Introduction

Over the last decade, influencer marketing has grown into a $14 billion industry.[1] This rapid growth is partly driven by the increasing popularity of major video content platforms, such as Twitch, YouTube, and TikTok, where influencers distribute their content and build their own fan communities. By 2022, YouTube had become the second most visited website in the world, attracting 30 million visitors per day, and Twitch had grown into a video game streaming giant that hosts over 100,000 live channels and attracts over 2.5 million viewers at any point in time.[2] These large audiences create a unique opportunity for companies to promote products and services by having video influencers showcase them during live streams. In fact, many practitioners believe that because influencers can better engage with their audiences and appear more trustworthy, influencer promotions have greater return on investment (ROI) than traditional advertising.[3]

This general excitement about influencer marketing, as it turns out, has little empirical support. When explaining why influencer promotions are effective, many practitioners cite anecdotes in which influencer marketing supposedly generated extraordinarily high ROI.[4] Nevertheless, industry observers caution against putting too much weight on these anecdotes, emphasizing that the industry has yet to devise reliable ways of measuring the effectiveness of influencer promotions.[5]

Absent experimental variation, measuring the effects of influencer marketing poses a significant empirical challenge. Influencers can often choose which products to promote and when to promote them. They may choose to promote high-quality products at a time when product sales are already trending up, which generates a simultaneity bias familiar from the literature on the effects of word-of-mouth (Seiler et al., 2017) and advertising (Shapiro et al., 2020).
In fact, the prior work on influencer marketing emphasizes this simultaneity bias as the central empirical challenge (Li et al., 2021; Yang et al., 2021). Addressing this challenge would help researchers and practitioners understand whether influencer promotions are indeed a highly effective marketing channel.

We study the effectiveness of influencer marketing in the video game industry, and we collect unique high-frequency data that enable us to address the main identification challenge. Specifically, we ask how video game streaming on Twitch affects the popularity of broadcasted games, defined as the number of their concurrent players. Twitch is the largest video game streaming platform in the world, hosting over 90% of all streaming content.[6] Streamers broadcast their gaming sessions [...]

[1] The influencer marketing industry had grown to $13.8 billion by 2021 (Influencer Marketing Hub, 2021).
[2] YouTube statistics come from the platform’s official blog (www.blog.youtube), and Twitch statistics are taken from a third-party Twitch monitoring website (twitchtracker.com/statistics).
[3] See the 2019 industry report by MediaKix (Bailis, 2019).
[4] Nielsen Catalina Solutions conducted a correlational study, concluding that influencer marketing generates 11 times higher ROI than online banner ads.
[5] See https://influencermarketinghub.com/influencer-marketing-roi/.
[6] See the Streamlabs and StreamHatchet Quarterly Report for Q3 (2020).

[...] advertising for video games (Grossman and Shapiro, 1984; Ackerberg, 2003). As a whole, our results show that despite the negative average ROI, Twitch influencer promotions may generate high positive returns when they are used to promote games by lesser-known publishers or to inform consumers about appealing game attributes.

Our paper contributes to the literature on the effectiveness of marketing media.
Much of the existing literature focuses on estimating the effects of TV advertising (Lodish et al., 1995; Liaukonyte et al., 2015; Shapiro et al., 2020), online advertising (Johnson et al., 2017; Gordon et al., 2019), and word-of-mouth (Lovett and Staelin, 2016; Seiler et al., 2017). We contribute by studying the effects of influencer marketing, an emerging marketing channel that has gained momentum over the past few years. The closest related papers are Li et al. (2021) on the effect of YouTube influencer content about gaming and Yang et al. (2021) on the effect of TikTok influencer videos. Both papers use within-product variation in video uploads across days and examine the impact of these uploads on the usage and sales of promoted products. We attempt to improve upon their analysis by constructing a novel dataset and developing an identification strategy that leverages high-frequency data. By leveraging within-day variation in streamers’ schedules, our approach brings the analysis to the level of granularity at which the simultaneity bias is less plausible.[8] Given the increasing availability of high-frequency data, this empirical strategy might prove helpful in future research on influencer marketing. Using our empirical strategy, we find that sponsored streams only marginally increase the popularity of broadcasted games. Our findings therefore question the conventional wisdom that influencer marketing generates a much greater return on investment (ROI) than traditional advertising.[9]

2 Measuring Video Streaming on Twitch

2.1 What is Twitch?

Twitch.tv is an Amazon-owned video live streaming platform mostly dedicated to streaming video games and broadcasting esports competitions. The platform has experienced tremendous growth over the last decade. In 2012, it had only several thousand registered channels and 70,000 concurrent viewers.
By 2022, it had grown into a video streaming giant that hosts over 100,000 live channels and attracts over 2.5 million viewers at any point in time.[10] This growth was mainly driven by the increasing worldwide popularity of esports and was further enhanced by stay-at-home orders of 2020-2022, which made millions of people around the globe look for new entertainment options. At the time of this writing, Twitch remains the largest video game streaming platform in the world, accounting for 63.6% of total hours of content watched and hosting 91.1% of all video game streaming content.[11] During peak hours, Twitch often attracts 5-6 million viewers, more than major TV networks such as Fox News, CNN, or MSNBC.

[8] In this sense, our paper resembles the prior work that measures the effectiveness of TV ads using high-frequency data and discontinuity-in-time research designs (Liaukonyte et al., 2015; He and Klein, 2019).
[9] Outside of studying the promotional effect, Simonov et al. (2020), Rajaram and Manchanda (2020), Hwang et al. (2021), and Ershov and Mitchell (2020) examine characteristics of influencer media and the extent to which these characteristics explain how viewers choose what to watch and whom to follow.
[10] These aggregate statistics are from twitchtracker.com, a third-party website that monitors the streaming activity on Twitch.tv and reports historical data going back to 2012.

Figure 1: A famous streamer Pokimane broadcasting her gameplay of the shooter game Valorant on Twitch.tv. The main window shows the gameplay that Pokimane is broadcasting live. The window in the top left corner shows the streamer’s web camera video. The vertical window on the right is the chat where viewers can send the streamer text messages in real time.

Twitch streamers broadcast their gameplay live by sharing their screen and web camera video on the platform. Figure 1 shows what these streams look like from the viewers’ perspective.
A typical streamer plays a video game live while commenting on the gameplay and engaging with viewers by replying to their chat messages.

Popular Twitch streamers follow different styles of live streaming. Some are professional esports players who focus on streaming one game and impress viewers with their gaming skills. One example is the professional “Fortnite” player Ninja, who once had 650,000 people watching him play Fortnite live (The Verge, 2018). Other streamers, such as Auronplay and Summit1G, stream different games every week to introduce variety to their content and expose viewers to video games of different genres. To make their broadcasts more engaging, streamers often attempt to make their content funny and entertaining (see Figure F.1 for some notable examples). One such example is the streamer DrDisrespect, whose vibrant personality and unique look (mullet hairstyle, 80s-style mustache, and polarized sunglasses) made him one of the most popular streamers on the platform.

[11] See the Streamlabs and StreamHatchet Quarterly Report for Q3 (2020). https://blog.streamlabs.com/streamlabs-stream-hatchet-q3-live-streaming-industry-report-a49adba105ba

2.2 Data on Twitch Streaming and Game Usage

To study how Twitch streaming affects the popularity of broadcasted games, we require a dataset that contains information about both live streaming and game usage. We construct a novel dataset by combining several data sources. First, we collect high-frequency data from Twitch by continuously monitoring live streaming and game viewership on the platform. These data describe when individual streamers go live, what games they stream, and how many viewers they attract at each point in time. Second, we also collect high-frequency data on the number of people currently playing each game. To obtain these data, we continuously monitor Steam, the largest online video game platform in the world.
These data help us track changes in the popularity of several hundred games during periods when streamers broadcast these games on Twitch. Finally, we complement these two datasets with information on daily (self-reported) subscription counts of individual streamers, which helps us estimate their hourly income.

Video streaming data from Twitch. We monitored video game streaming on Twitch for almost eight months between May 11, 2021 and December 31, 2021. We first pre-selected a list of 60,000 streamers during a three-week pilot period. Each streamer was selected with probability proportional to the total number of viewers they attracted on Twitch during this pilot period (see Appendix A for details). Then, during the following eight months, we sent high-frequency requests to the Twitch API to retrieve information about each streamer. Every 10 minutes, we requested the status of each streamer (online or offline), the concurrent number of viewers, the game they were streaming, and the title of each stream. Most streamers went live on Twitch at least once during the sample period, so our sample covers 96.8% of the streamers we attempted to track (58,060 out of 60,000). Tracking streamers at high frequency enables us to record the exact times at which a streamer starts and ends broadcasting any game, which serves as the main input to our empirical strategy (see Section 3).

Table 1 summarizes live streaming activity on Twitch, both overall and for the most popular streamers. Throughout the paper, we measure streamer popularity using the average number of concurrent viewers. In the last two rows, the table reports averages (1) across the top 5% most popular streamers, and (2) across all 58,060 streamers. The average streamer attracts only around 150 viewers and streams about 5.4 hours per day conditional on working on that day (14.6 hours per week).
By contrast, the top 15 streamers attract over 30,000 viewers at any given time and often [...]

Table 2: Streaming activity, usage, and attributes of games.

                                 Mean  Std.dev       P5      P25      P50      P75      P95
Panel A. Streaming activity, viewership, and usage (total during the observation period)
No. Streams (All Streamers)     9,079   42,206       14      178      959    4,481   31,464
No. Streams (Top Streamers)       666    3,794        2       15       61      239    2,093
Hours Streamed (thousands)       28.2    135.8        0      0.4      2.2     12.5    100.8
Hours Viewed (thousands)        6,842   52,482        5       72      394    1,689   17,109
Hours Played (thousands)       30,165  167,518        5      252    1,710   12,267  101,523
Panel B. Streaming activity, viewership, and usage (per day)
No. Streams (All Streamers)      32.8    152.4      0.1      0.6      3.5     16.2    113.6
No. Streams (Top Streamers)       2.4     13.7      0.0      0.1      0.2      0.9      7.6
Hours Streamed                  101.7    490.3      0.1      1.3      8.0     45.0    363.8
Hours Viewed                   24,700  189,467       19      262    1,423    6,097   61,767
Hours Played                  130,585  725,187       23    1,092    7,406   53,105  439,493
Panel C. Game attributes
Publisher Size (no. games)        4.9      5.6      1.0      1.0      2.0      9.0     17.0
Years Since Release               3.9      4.2      0.1      0.8      2.7      5.5     12.5
Rating Metacritic Score          78.8      9.7     62.0     75.0     80.0     85.0     91.0
Regular Price (dollars)          21.1     16.3      0.0     10.0     20.0     30.0     60.0
Customer Rating St. Dev.          2.3      0.7      1.0      1.9      2.4      2.8      3.1

This table shows streaming, viewership, and usage statistics for 599 Steam games in our sample. To estimate the time viewers spend watching a stream, we multiply the number of current viewers obtained from the Twitch API by 10 minutes (the frequency of data collection). We construct the streaming time statistics in a similar fashion. The variable “years since release” captures the number of years passed between the official game release and the first day of our data collection. The publisher size is the number of games a publisher released among the 599 titles in our sample. The regular price is the 95th percentile of the distribution of daily prices for a given game, which usually corresponds to the non-discounted price.

[...] of broadcasting on Twitch.
To this end, we gathered additional data on the number of active subscribers of each streamer. Because viewer subscriptions represent a significant share of streamers’ regular income, the estimate we obtain gives us an informative lower bound on the daily and hourly income of top streamers (see Appendix A.2 for details). Twitch does not publish any official data on subscriptions, so we instead obtained subscription data from a third-party website, twitchtracker.com, which tracks the number of active subscriptions for Twitch streamers who chose to publicly disclose this information. By tracking the 10,000 most-subscribed streamers on a daily basis, we collected the current number of active subscriptions and its breakdown by subscription type (i.e., Tier 1, Tier 2, Tier 3, or Amazon Prime), each of which has a fixed dollar value per subscription. We then converted these subscription counts into daily and monthly income estimates. The resulting estimates capture pre-tax income after the streamer has paid the commission to Twitch. The main limitation of these data is that streamers self-select into disclosing subscription counts, so we cannot assume our dataset on subscription revenues includes a random subsample of streamers.

We find that the average streamer in our sample has 983 active subscriptions and earns around $3,800 per month in subscription revenues. In contrast, the average top 5% streamer earns about $20,000-30,000 in subscription revenues each month. Dividing their monthly subscription income by their streaming hours, we obtain that the average top streamer earns about $144 per hour of live streaming. In Section 4.4, we use this number as an estimate of the hourly income of top streamers.

3 The Effect of Stream Viewership on Game Usage

3.1 Empirical strategy

We aim to estimate the causal effect of Twitch stream viewership on the broadcasted games’ popularity, defined as the number of concurrent players.
In an ideal experiment, we would make streamers broadcast random games at random times of the day and measure the corresponding lift in the number of concurrent players. Such an experiment is impossible to implement because neither Twitch nor we can control when streamers go live and which games they broadcast. As Seiler et al. (2017) point out, this lack of experimental variation makes it difficult to measure the effect of organic content on product popularity.

We adopt an instrumental variable strategy that mimics the ideal experiment and control for confounding factors using fixed effects. We leverage our high-frequency data and focus on variation in the exact broadcast schedules of top streamers within a given day. Although streamers can strategically decide which days to work on and which games to broadcast, our main identifying assumption is that within a given day, their streaming schedules do not respond to real-time changes in game popularity. This assumption is reasonable in the context of Twitch streaming. Instead of working regular hours, many streamers start working whenever it is convenient for them given other demands on their time, such as university classes and part-time jobs. They may finish broadcasting earlier than planned due to fatigue or later than planned if events that occur in the game lead to a longer gaming session. Because of these idiosyncratic decisions, many Twitch streamers follow irregular streaming schedules when broadcasting a given game. By leveraging this unpredictable variation in streaming schedules, we estimate how Twitch viewership affects game popularity.

We now present model-free evidence that supports this empirical strategy. We first demonstrate that individual streamers follow irregular broadcast schedules. Figure 2 visualizes the variation in broadcast schedules of three streamers, randomly drawn from the pool of the 5% most popular streamers (“top streamers”).
In each graph in Figure 2, a row corresponds to 24 hours of a given day, and square markers indicate whether the streamer was live on Twitch in each hour of that day. Marker colors represent primary Steam games (dark orange), other games (light blue), and “just chatting” with the audience (light gray). The figure shows that streamers xQcOW and Ibai have haphazard schedules that change every day. They often shift their work schedules by several hours and switch between games. Streamer Sykkuno follows a more stable schedule, but even his schedule exhibits substantial variation across days. In the Appendix, we visualize schedules of several other top streamers and show that most of them have highly variable streaming schedules, similar to the three examples here (see Figure F.4).

Figure 2: Daily work schedules of top Twitch streamers. This figure visualizes the daily work schedules of three streamers, which we randomly selected from the pool of 5% most popular Twitch streamers. Each of the three graphs visualizes the time slots in which each person was streaming live on Twitch (colored squares), with the horizontal axis showing the time of the day and the vertical axis showing different dates during the first month of our sample. We use color to depict whether a streamer was broadcasting their primary Steam game (dark orange), or some other game (light blue), or was “just chatting” (light gray).

Table 3 further summarizes this variation in broadcast schedules by reporting the standard deviations of start times, end times, and stream durations. We decompose this variation into three components: (a) variation across games, (b) variation across streamers within a game, and (c) variation across dates within a streamer-game combination. Within-streamer-game variation explains 50-60% of the total variation of broadcast timing. In contrast, about 40% of the total variation is across streamers for a given game and less than 10% is across games.
This decomposition confirms that individual streamers follow variable schedules and frequently change the timing of their live broadcasts within a day, thus further supporting our empirical strategy.

Next, we show that, when a top streamer starts to broadcast a game, both Twitch viewership [...]

[...] or new version releases, which may affect both viewership and game usage. The game-hour of the day fixed effects $\mu_{j,h(t)}$ account for predictable within-day shifts in game popularity, which might occur if different games are played from different time zones (see Figures F.2-F.3 in the Appendix). Finally, time fixed effects $\eta_t$, common across games, capture unobserved events that affect the opportunity cost of time of both players and streamers, such as the effects of holidays or major sports events. In this model and the subsequent analysis, we aggregate data to one-hour time intervals to reduce computational complexity. We obtain almost identical estimates when using the original 10-minute time intervals (see Appendix C.1).

The variable $V_{jt}$ in (1) represents the total Twitch viewership stock for game $j$ in period $t$. One can think of $V_{jt}$ as measuring the cumulative amount of time that viewers have recently spent watching the streams of game $j$ on Twitch, which might cause them to play the game. We define $V_{jt}$ as a weighted sum of the recent number of viewers with geometrically decaying weights:

$$V_{jt} = \sum_{\tau=0}^{T} \delta^{\tau} \, \mathrm{viewers}_{j,t-\tau}, \tag{2}$$

where $\mathrm{viewers}_{jt}$ is the total number of people watching Twitch streams of game $j$ in period $t$, and δ is between zero and one. In estimation, we assume $T = 72$, which allows the effects of viewership to persist for up to three days. This geometric weight specification is similar to the models used to capture persistent advertising effects in the prior work (Shapiro et al., 2020). When put together, equations (1) and (2) yield a model in which an increase in Twitch viewership can influence game usage both concurrently and in future periods.
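As an illustration of equation (2), the following Python sketch computes the viewership stock for a single game from its hourly viewer series. The function name `viewership_stock`, the use of NumPy, and the choice to treat hours before the start of the series as having zero viewers are assumptions made for this example; they are not details taken from the paper.

```python
import numpy as np


def viewership_stock(viewers, delta, T=72):
    """Geometrically weighted viewership stock, as in equation (2).

    viewers : hourly viewer counts for one game (hypothetical input name)
    delta   : decay parameter between zero and one
    T       : number of lagged hours included (72 in the text)
    """
    viewers = np.asarray(viewers, dtype=float)
    weights = delta ** np.arange(T + 1)  # delta^0, delta^1, ..., delta^T
    # Entry t of the full convolution equals
    # sum_{tau=0}^{min(t, T)} delta^tau * viewers[t - tau];
    # hours before the sample start are treated as zero (an assumption).
    return np.convolve(viewers, weights)[: len(viewers)]


# A one-hour burst of 100 viewers decays geometrically in later hours.
print(viewership_stock([100, 0, 0], delta=0.5))
```

Under this specification the weight on viewers from ten hours earlier is δ to the tenth power; with a decay value of 0.828, for instance, `0.828 ** 10` is roughly 0.15.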
The parameters of interest are β and δ. We interpret β as the elasticity of game usage with respect to the cumulative viewership stock $V_{jt}$. We refer to β as the streaming elasticity. By contrast, we interpret δ as the decay parameter that captures the persistent effects of viewership. Some viewers might immediately download the game once they see it on Twitch (an immediate effect), while others might research the game later, or they can immediately download it but keep playing it after the stream (a carry-over effect). The decay parameter δ captures the magnitude of the carry-over effect relative to the immediate effect.

The primary source of identification is within-day variation in the broadcast schedules of top streamers. Having controlled for game-date fixed effects $\lambda_{j,d(t)}$ and for other fixed effects in (1), we ask how these live broadcasts drive the game’s viewership on Twitch, and how a lift in viewership translates into changes in game usage, both during and after the broadcast. To this end, we construct a vector of instruments $Z_{jt} = \{ z_{j,t}, z_{j,t-1}, \ldots, z_{j,t-12} \}$ that capture how many top streamers broadcast game $j$ in period $t$ and in the 12 hours preceding $t$ (Appendix C.1 shows that our results are robust to including a different number of lagged variables $z_{jt}$). As we show in Appendix B.1, most variation in these instruments comes from days on which a game was either broadcasted by one top streamer or not broadcasted by top streamers at all. The main identifying assumption is that, conditional on fixed effects, the broadcast decisions of top streamers in the past 12 hours are orthogonal to idiosyncratic shocks in game popularity, $\varepsilon_{jt}$, so that

$$E\left[ \varepsilon_{jt} \mid Z_{jt}, \lambda_{j,d(t)}, \mu_{j,h(t)}, \eta_t \right] = 0. \tag{3}$$

One might worry that streamers strategically schedule their broadcasts to coincide with significant in-game events (e.g., new version releases or tournaments), which may create a correlation between the instruments $Z_{jt}$ and shocks $\varepsilon_{jt}$ even within a day.
However, if this were the case, the number of players would increase even before a top streamer goes live, reflecting that the game is already trending up on Twitch by the time the live stream starts. By contrast, Figure 3 shows that the number of players remains relatively constant before the stream and increases sharply right after the start of the stream. We also demonstrate in Appendix C.1 that controlling for game-week instead of game-date fixed effects produces similar estimates of β and δ, suggesting that streamers do not strategically choose their broadcast days within a week based on game-specific demand shocks. By implication, it might be even less likely that they respond to these demand shocks when choosing the exact broadcasting time within a day. We therefore do not find any indication that within-day endogeneity poses a substantial threat to our empirical strategy.

The identifying assumption in (3) implies the condition $E\left[ \varepsilon_{jt} Z_{jt} \mid \lambda_{j,d(t)}, \mu_{j,h(t)}, \eta_t \right] = 0$, which we use to estimate parameters β and δ. Specifically, we minimize the sum of squared interactions between residuals $\varepsilon_{jt}$ and instruments $Z_{jt}$:

$$(\hat{\beta}, \hat{\delta}) = \arg\min_{(\beta,\delta)} \sum_{j} \sum_{t} Z_{jt}' Z_{jt} \left( \log\left(1 + \mathrm{players}_{jt}\right) - \beta \log\left(1 + V_{jt}(\delta)\right) - \lambda_{j,d(t)} - \mu_{j,h(t)} - \eta_t \right)^2. \tag{4}$$

This objective function corresponds to GMM estimation with the identity weighting matrix, which under the assumption (3) yields consistent estimates of β and δ. To solve the minimization problem in (4), we perform a golden-section search for the parameter δ, and for a given guess of δ, we estimate parameter β using a closed-form 2SLS formula (see Appendix B.2 for details). We obtain clustered standard errors via bootstrap by sampling game-date combinations with replacement.

3.3 The average effect of Twitch streaming

Table 4 presents parameter estimates from the model in (1).
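The estimation routine of Section 3.2 (lagged instruments, a closed-form 2SLS step for β at each candidate δ, and a golden-section search over δ) might be sketched in Python roughly as follows. This is a simplified illustration under several assumptions that are not taken from the paper: it handles one game, it takes `y` to be log(1 + players) with the fixed effects already partialled out, it fills lags that reach before the start of the sample with zeros, it reads objective (4) literally as printed, and all function and argument names are hypothetical.

```python
import numpy as np


def viewership_stock(viewers, delta, T=72):
    """Equation (2): geometrically weighted sum of current and past viewers."""
    weights = delta ** np.arange(T + 1)
    return np.convolve(np.asarray(viewers, dtype=float), weights)[: len(viewers)]


def lagged_instruments(top_live, n_lags=12):
    """Columns z_t, z_{t-1}, ..., z_{t-n_lags}; pre-sample lags set to zero."""
    top_live = np.asarray(top_live, dtype=float)
    n = len(top_live)
    Z = np.zeros((n, n_lags + 1))
    for lag in range(n_lags + 1):
        Z[lag:, lag] = top_live[: max(n - lag, 0)]
    return Z


def two_sls_beta(y, x, Z):
    """Closed-form 2SLS slope for a single endogenous regressor x."""
    first_stage, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ first_stage  # projection of x on the instruments
    return float(x_hat @ y) / float(x_hat @ x)


def objective(delta, y, viewers, Z):
    """Objective (4) read literally, with beta profiled out by 2SLS."""
    x = np.log1p(viewership_stock(viewers, delta))
    beta = two_sls_beta(y, x, Z)
    resid = y - beta * x
    weights = np.einsum("ij,ij->i", Z, Z)  # Z'Z for each observation
    return float(weights @ resid**2), beta


def golden_section(f, lo, hi, tol=1e-4):
    """Golden-section search for a minimizer of f on [lo, hi] (assumes unimodality)."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def estimate(y, viewers, Z, lo=0.0, hi=0.99):
    """Search over delta, then recover beta at the selected delta."""
    y = np.asarray(y, dtype=float)
    delta_hat = golden_section(lambda d: objective(d, y, viewers, Z)[0], lo, hi)
    _, beta_hat = objective(delta_hat, y, viewers, Z)
    return beta_hat, delta_hat
```

Golden-section search only guarantees a global minimizer when the objective is unimodal in δ, so in practice one might check the result against a coarse grid. The bootstrap over game-date combinations used for standard errors is omitted here.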
The first column shows the estimates from an OLS regression that assumes away persistent effects (i.e., setting δ = 0) and does not include any fixed effects. Because viewership and game usage are highly correlated, the OLS estimate without controls returns a high estimated elasticity of 0.606. The next two columns show IV estimates obtained using the GMM estimator in equation (4).

Table 4: The effect of Twitch viewership on video game usage

Variable                     Parameter         OLS         IV         IV
Log Viewership Stock V_jt    β            0.606***   0.015***   0.033***
                                           (0.002)    (0.002)    (0.009)
Effect Persistence           δ                                  0.828***
                                                                 (0.060)
Game-Date FE                                    No        Yes        Yes
Game-Hour of day FE                             No        Yes        Yes
Time FE                                         No        Yes        Yes
Observations                             3,277,728  3,277,728  3,277,728

Column 1 shows results from an OLS regression that fixes the persistence parameter to zero (δ = 0) and does not control for any fixed effects. Columns 2-3 show results from our main specification in (1), without persistence (column 2) and with persistence (column 3). Bootstrap standard errors are clustered at the game-date level. *, **, and *** represent significance at the 10%, 5%, and 1% level.

Column 2 reports IV estimates from a specification that sets persistence to zero (δ = 0), which yields an elasticity estimate of 0.015. This estimate is much lower than that from the OLS regression, consistent with the idea that including instruments $Z_{jt}$ and fixed effects helps remove the simultaneity bias. But because this specification ignores persistent effects, it might still generate a biased estimate of the streaming elasticity β. For example, if the true effect persists several hours after a stream, this model will fail to attribute the elevated post-stream game usage to the effect of the broadcast and might bias the estimated β toward zero. Consistent with this observation, we find that allowing for persistent effects raises the estimated streaming elasticity from β̂ = 0.015 to β̂ = 0.033 (column 3 in Table 4).
We also estimate the persistence parameter to be δ̂ = 0.828, suggesting that the initial effect becomes 17% weaker in every subsequent hour and dissipates to 15% of its initial magnitude within about ten hours.

The estimated streaming elasticity of 0.033 is broadly in line with the previous work on the effects of word-of-mouth and advertising. For example, Seiler et al. (2017) estimate the elasticity of 0.016 when studying how the number of organic comments on a microblogging platform increases the viewership of discussed TV shows. Their estimates can be compared to ours because they study the impact of word-of-mouth activity, which is analogous to Twitch viewership in our analysis, and they also adopt a measure of consumption (i.e., TV show audience) as the main outcome variable. Similarly, Shapiro et al. (2020) report the mean elasticity of 0.023 and the median elasticity of 0.014 when estimating the effect of TV advertising viewership on the sales of packaged goods.[14]

[14] Seiler et al. (2017) find no strong evidence of persistent effects. Since their estimation leverages one specific shock, i.e., the blocking of microblogging platform Sina Weibo in China, these results might reflect that the persistent effects are difficult to identify from their data. Shapiro et al. (2020) use a similar construction of viewership stock [...]

[...] 5 shows, organic streams are four times as effective as sponsored streams, with the estimated elasticities of 0.030 and 0.007. We find similar results when we consider sponsored streams funded by official partnership programs (estimated elasticities 0.032 vs 0.006).

We draw several conclusions from these results. First, organic live streams have positive effects on the short-term popularity of video games.
This finding contradicts the belief of some practitioners in the video game industry that live streams divert consumers from playing games by providing an alternative source of entertainment and allowing them to experience the gameplay without paying (Johnson and Woodcock, 2019). In other words, we find that live streaming and gaming should be viewed as complementary leisure activities rather than substitutes.

Second, our results show that sponsored streams are relatively ineffective at bringing additional players to the broadcasted games. Later in Section 4.4, we use our elasticity estimates to compute the return on investment (ROI) and find that paying for sponsored streams is not profitable for the average game. Our findings therefore question the conventional wisdom that influencer marketing generates much greater ROI than traditional advertising.

Although Twitch influencer promotions generate negative returns on average, they might still be effective at promoting relatively unknown games. Given the long format of Twitch broadcasts, such promotions might also reveal rich information about the game’s price, quality, and gameplay. In the next section, we empirically explore these conjectures and study what kind of games might substantially benefit from being promoted in sponsored streams on Twitch.

4 Which Games Benefit the Most from Twitch Streaming?

4.1 Overview of potential mechanisms

To understand whether sponsored streams might be effective at promoting certain games, we first need to understand what kind of games are most likely to benefit from live broadcasts. Answering this question in turn requires understanding what mechanisms drive the effect of streaming on game popularity.

We hypothesize that Twitch streams either inform consumers about the existence of the broadcasted games or reveal information about their price, quality, or gameplay. First, consumers face an enormous choice set and might not be aware of all offered titles.
On Steam alone, they encounter an assortment of more than 60,000 games, which grows with thousands of new titles introduced every year. Twitch streams can draw consumers’ attention to specific games, thus generating an awareness effect (Honka et al., 2017; Tsai and Honka, 2021). An example of such an awareness effect is the game “Among Us” by a small indie studio, which stayed dormant on Steam for almost two years and only became popular when consumers learned about it from Twitch broadcasts.

Additionally, even if consumers are aware of a game, they might learn something about its price, quality, or gameplay by watching streamers play it live on Twitch. For example, while many consumers know that “Dark Souls” is a monster fighting game, they might realize after watching Twitch streams that this game stands out with its intricate level design and combat depth. In other words, live streams might serve as informative advertisements that reveal the horizontal and vertical attributes of broadcasted games (Grossman and Shapiro, 1984; Ackerberg, 2003).

If streams provide any information at all, they should have stronger effects on relatively unknown games. To proxy consumer knowledge, we assume that consumers are less informed about new games and games released by small publishers, who lack the reputation and marketing budgets of large publishers. We start by testing whether the streaming elasticity β negatively correlates with game age and publisher size.

To understand what information consumers might learn after watching Twitch streams, we also study whether β varies across games with different vertical game attributes, such as price and quality. We measure quality using Metacritic ratings, a widely recognized quality metric analogous to Rotten Tomatoes ratings for movies. Streams may either directly reveal information about quality, or they can encourage consumers to research the game and learn its quality.
A similar effect may arise with prices: although our informal observation suggests that streamers rarely talk about game prices during live streams, consumers might still learn the game’s price while doing their own research after the stream. Both mechanisms, direct and indirect, would make consumers more likely to adopt inexpensive or high-quality games after seeing them on Twitch.

Finally, we analyze whether streaming disproportionately benefits “niche” games that strongly appeal to some consumers despite their mediocre quality. We proxy “niche” games by using the standard deviation of customer ratings on Metacritic.

4.2 Estimates from median sample splits

We start by subsampling games and comparing the estimated streaming elasticities across subsamples. Table 6 presents the estimated streaming elasticities β̂ from the main specification in (1)-(2) for different subsamples. Throughout this section, we simplify estimation by fixing the persistence parameter at the level estimated in Section 3 (δ̂ = 0.828).

In rows 1-4 of Table 6, we estimate streaming elasticities by game age and publisher size. To this end, we define new games as those released within 2.7 years prior to our sample period. We additionally define small publishers as those who only sell one game from our sample of 599 titles. Both definitions roughly correspond to the median splits of variables “game age” and “publisher size.” We find that new games have a somewhat higher estimated elasticity, although the difference is small and not statistically significant (elasticities of 0.034 vs 0.032). Nevertheless, we find that the games produced by small publishers benefit from streaming almost three times more than the games of large publishers (elasticities of 0.052 vs 0.019). This result is consistent with the information effect, considering that small publishers have modest advertising budgets and do not get the same media coverage as big conglomerates like EA Games and Ubisoft. Consumers might therefore be unaware of these publishers’ games, and Twitch broadcasts might break this awareness barrier.

Table 6: Streaming elasticities by game characteristics

                                          Log viewership stock V_jt    Number of observations
                                          β̂ Estimate    β̂ S.E.         No. games    No. obs.
By game age:
  New games (<2.7 years old)              0.034***      (0.004)        299          1,636,128
  Old games (≥2.7 years old)              0.032***      (0.008)        300          1,641,600
By publisher size:
  Small publisher (1 game)                0.052***      (0.011)        283          1,548,576
  Large publisher (2+ games)              0.019***      (0.003)        316          1,729,152
By price:
  Inexpensive (<$20)                      0.044***      (0.010)        281          1,537,632
  Expensive (≥$20)                        0.023***      (0.004)        318          1,740,096
By quality:
  High quality (Metacritic score >80)     0.041***      (0.010)        219          1,198,368
  Low quality (Metacritic score ≤80)      0.028***      (0.004)        247          1,351,584
By rating variance:
  Niche games (rating std. >2.4)          0.050***      (0.010)        212          1,160,064
  Mainstream games (rating std. ≤2.4)     0.008***      (0.002)        213          1,165,536

This table presents the estimates of streaming elasticity β̂ for games with different characteristics, holding the persistence parameter δ at the level estimated in Table 4. All specifications include game-date, game-hour of day, and time fixed effects. Standard errors are clustered at the game-date level. *, **, and *** represent significance at the 10%, 5%, and 1% level.

We further study what kind of information consumers acquire from Twitch broadcasts. Rows 5-8 of Table 6 show that Twitch streams are more effective for inexpensive games and for high-quality games. We define inexpensive games as those with a regular price below the median level of $20. We estimate an elasticity of 0.044 for inexpensive games, about twice as large as the elasticity for expensive games (0.023). Similarly, we obtain a higher estimated elasticity for high-quality games, defined as games whose Metacritic ratings are above the median. The estimated elasticity is about 50% larger for high-quality than low-quality games (elasticities of 0.041 vs 0.028).
These results suggest that by watching live streams, consumers acquire information, directly or indirectly, about the vertical attributes of broadcasted games.

Twitch streams might also help consumers understand whether the game matches their idiosyncratic preferences. This mechanism might be especially relevant for low or medium-quality games, whose mediocre ratings reflect that not all consumers enjoy them. These might be “niche” games that appeal to some consumers but leave others indifferent. Using the standard deviation of user ratings to proxy “niche” games, we compare the estimated elasticity for games above and below the median value of this proxy. Rows 9-10 of Table 6 show that niche games have a much higher streaming elasticity of 0.050. On the other hand, Twitch broadcasts barely affect mainstream games [...]

[Figure 5 panels. (A) By publisher size and game age; axes: number of games by publisher, years since release. (B) By Metacritic rating and price; axes: Metacritic rating, regular price. (C) By Metacritic rating and std. dev. of the user rating; axes: Metacritic rating, std. dev. of consumer rating.]

Figure 5: Estimated streaming elasticities from Generalized Random Forests. Each graph on the left visualizes the estimated function β̂(X_j) by focusing on two dimensions at a time, while holding all other attributes X_j fixed at their average levels (see details in Appendix D.4). Each figure on the right shows the empirical distribution of the same two game attributes and presents the contour lines of the estimated density function.
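The median-split comparisons in Table 6 can be checked with a quick back-of-the-envelope calculation. The sketch below is not part of the paper's analysis: it assumes that the two estimates in each split are independent and approximately normal (our simplification), and the helper name `diff_z` is hypothetical. It takes the point estimates and standard errors reported in Table 6 and prints the ratio of the two elasticities together with a z-statistic for their difference.

```python
import math


def diff_z(b1, se1, b2, se2):
    """z-statistic for b1 - b2, assuming independent, roughly normal estimates."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)


# (estimate, standard error) pairs copied from Table 6
splits = {
    "game age (new vs old)": ((0.034, 0.004), (0.032, 0.008)),
    "publisher size (small vs large)": ((0.052, 0.011), (0.019, 0.003)),
    "price (inexpensive vs expensive)": ((0.044, 0.010), (0.023, 0.004)),
    "quality (high vs low)": ((0.041, 0.010), (0.028, 0.004)),
    "rating dispersion (niche vs mainstream)": ((0.050, 0.010), (0.008, 0.002)),
}

for name, ((b1, se1), (b2, se2)) in splits.items():
    ratio = b1 / b2
    z = diff_z(b1, se1, b2, se2)
    print(f"{name}: ratio = {ratio:.2f}, z = {z:.2f}")
```

Under these assumptions the game-age gap gives a z-statistic of roughly 0.22, in line with the statement that the difference is not statistically significant, while the publisher-size gap gives roughly 2.89.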
Panel B shows the estimated elasticities β̂ by game price and rating. We find higher elasticities for inexpensive games, especially those priced under $5, which is consistent with our earlier analysis. The density plot on the right suggests that this pattern is mostly driven by the high streaming elasticities of free games. As for the ratings, we find a nonlinear pattern whereby games with ratings around 80-85, slightly above the median, have the highest estimated elasticities. Somewhat counterintuitively, estimated elasticities drop as the rating approaches 90-95. We observe a similar pattern in Panel C. It is possible that games with high ratings already get extensive media coverage and therefore do not benefit from additional exposure on Twitch, although we cannot directly test this hypothesis with our data.

Panel C visualizes the estimated elasticities by Metacritic expert rating and the standard deviation of consumer ratings. Consistent with our previous results, we find that games with highly dispersed consumer ratings benefit more from Twitch streaming than games with more uniform consumer ratings. This result suggests Twitch streams reveal the gameplay of the broadcasted game, thus helping consumers understand whether the game matches their preferences.

When put together, these results suggest that despite the modest average streaming elasticity, a small fraction of games benefit considerably from streaming. For example, Twitch streams might be effective at promoting games by lesser-known publishers, informing consumers about appealing game attributes (e.g., low price), or promoting niche games that appeal to a small group of consumers. In fact, the promotional effects for some games might be sufficiently high for sponsored streams to generate positive ROI. We now demonstrate this point by calculating ROI separately for each game.

4.4 When is it profitable to sponsor live streams?
To calculate the implied returns on investment, we consider a counterfactual in which a publisher pays a top streamer a fixed fee for broadcasting a game for one hour. Using the estimated streaming effects β̂(X_j) from equation (6), we predict how many new players will be brought into the game in this hour of streaming and how this increase will affect the expected profits of the sponsoring publisher. We predict the increase in the expected profit as

\[ \text{Profit lift}_j = \text{Conversion Rate} \times \Delta\text{Players}_j \times \text{Profit Margin}_j - \text{Streaming Costs} \tag{7} \]

where the first term captures the expected revenue lift and the second term is the fixed fee that the publisher pays for one hour of streaming. We calculate the lift in the number of players, ΔPlayers_j, by combining the game-specific streaming elasticity β̂(X_j) with the game-specific lift in the number of viewers generated by the broadcast of a top streamer.

[Figure 6 panels. Top: “Players brought to the game by a sponsored stream”; x-axis: % lift in number of players due to the sponsored stream; y-axis: frequency. Bottom: “Predicted revenue increase from a sponsored stream”; x-axis: predicted revenue increase (dollars); y-axis: frequency; marked hourly fee: $144.]

Figure 6: Is It Profitable to Sponsor Live Streams? The figure shows the predicted increase in the number of active players (top panel) and in the expected dollar revenues (bottom panel) from a one-hour live stream on Twitch. The predicted increase in the number of active players in the top panel is computed using the formula (11), whereas the predicted revenue increase in the bottom panel corresponds to the first term on the right-hand side of the profit lift formula (7).

Since sponsored streams are less effective than organic ones, we discount the streaming elasticity by a factor of 0.233 = 0.007/0.030, the ratio between the average sponsored and organic stream elasticities in Table 5. See Appendix E for details on how we compute ΔPlayers_j.
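To make the arithmetic in equation (7) concrete, here is a minimal sketch of the profit lift calculation. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the player lift is taken as a given input rather than computed as in Appendix E, and the conversion rate, player lift, and per-copy margin in the example are made-up values. Only the $144 hourly fee (Figure 6) and the 0.007/0.030 discount ratio (Table 5) come from the text.

```python
def profit_lift(conversion_rate, delta_players, profit_margin, streaming_cost):
    """Equation (7): expected revenue lift minus the fixed streaming fee."""
    revenue_lift = conversion_rate * delta_players * profit_margin
    return revenue_lift - streaming_cost


def break_even_players(conversion_rate, profit_margin, streaming_cost):
    """Player lift at which the profit lift in equation (7) is exactly zero."""
    return streaming_cost / (conversion_rate * profit_margin)


# Ratio of sponsored to organic elasticities reported in Table 5.
SPONSORED_DISCOUNT = 0.007 / 0.030


def sponsored_elasticity(organic_elasticity):
    """Discount an organic streaming elasticity to its sponsored counterpart."""
    return SPONSORED_DISCOUNT * organic_elasticity


# Hypothetical inputs, chosen only to illustrate the formula.
conversion_rate = 0.1   # assumed share of new players who buy a copy
delta_players = 500     # assumed lift in players from one sponsored hour
profit_margin = 14.0    # assumed per-copy margin in dollars
streaming_cost = 144.0  # hourly fee shown in Figure 6

print(profit_lift(conversion_rate, delta_players, profit_margin, streaming_cost))
print(break_even_players(conversion_rate, profit_margin, streaming_cost))
```

With these made-up inputs the profit lift is 556 dollars, and the stream breaks even at a lift of about 103 players. Keeping the player lift as an explicit argument makes it easy to see how sensitive the sign of the return is to the assumed conversion rate.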
To obtain the profit margin of paid games, we assume that a per-unit profit margin equals 70% of the game’s price because Steam charges a 30% fee for publishing games on its platform. Although some games in our sample are free, they often bring comparable margins to their publishers via in-game transactions. To compute the profit margin for such games, we assume that their sales revenues are equal to the median sales revenue among the paid games.

The conversion rate in (7) reflects the share of players brought into the game by the stream who will buy a game copy. Because we do not have sales data to estimate the conversion rate, we need [...]

References

ACKERBERG, D. A. (2003): “Advertising, learning, and consumer choice in experience good markets: an empirical examination,” International Economic Review, 44, 1007–1040.

ALDRIDGE, T. (2019): “Twitch for Game Developers,” Twitch Official Blog.

ALLEN, J. (2022): “Lost Ark helps Asmongold beat Twitch viewer records,” NME News.

ATHEY, S., J. TIBSHIRANI, AND S. WAGER (2019): “Generalized random forests,” The Annals of Statistics, 47, 1148–1178.

BAILIS, R. (2019): “The state of influencer marketing: 10 influencer marketing statistics to inform where you invest,” Big Commerce.

BREIMAN, L. (2001): “Random forests,” Machine learning, 45, 5–32.

CORREIA, S. (2016): “A feasible estimator for linear models with multi-way fixed effects,” Preprint at http://scorreia.com/research/hdfe.pdf.

EFRON, B. AND R. J. TIBSHIRANI (1994): An introduction to the bootstrap, CRC press.

ERSHOV, D. AND M. MITCHELL (2020): “The effects of influencer advertising disclosure regulations: Evidence from instagram,” in Proceedings of the 21st ACM Conference on Economics and Computation, 73–74.

FENLON, W. (2020): “How Among Us became so wildly popular,” PC Gamer.

GAURE, S. (2013): “OLS with multiple high dimensional category variables,” Computational Statistics & Data Analysis, 66, 8–18.

GEYSER, W. (2021): “The state of influencer marketing 2021: Benchmark report,” Influencer Marketing Hub.

GOODMAN, L. (2021): “How much do Twitch streamers make?” Stream Scheme.

GORDON, B. R., F. ZETTELMEYER, N. BHARGAVA, AND D. CHAPSKY (2019): “A comparison of approaches to advertising measurement: Evidence from big field experiments at Facebook,” Marketing Science, 38, 193–225.

GREENBAUM, A. (2020): “Nintendo continues to target streamers,” SVG News.

GROSSMAN, G. M. AND C. SHAPIRO (1984): “Informative advertising with differentiated products,” The Review of Economic Studies, 51, 63–81.

GUIMARAES, P. AND P. PORTUGAL (2010): “A simple feasible procedure to fit models with high-dimensional fixed effects,” The Stata Journal, 10, 628–649.

HE, C. AND T. J. KLEIN (2019): “Advertising as a reminder: Evidence from the Dutch State Lottery.”

HONKA, E., A. HORTAÇSU, AND M. A. VITORINO (2017): “Advertising, consumer awareness, and choice: Evidence from the US banking industry,” The RAND Journal of Economics, 48, 611–646.

HWANG, S., X. LIU, AND K. SRINIVASAN (2021): “Voice Analytics of Online Influencers – Soft Selling in Branded Videos,” Available at SSRN 3773825.

JOHNSON, G. A., R. A. LEWIS, AND E. I. NUBBEMEYER (2017): “Ghost ads: Improving the economics of measuring online ad effectiveness,” Journal of Marketing Research, 54, 867–884.

JOHNSON, M. R. AND J. WOODCOCK (2019): “The impacts of live streaming and Twitch.tv on the video game industry,” Media, Culture & Society, 41, 670–688.

LI, I. (2022): “Sponsorship Disclosure in Livestreaming: The Behavior of Streamers on Twitch,” Working Paper.

LI, N., A. HAVIV, AND M. J. LOVETT (2021): “Digital Marketing and Intellectual Property Rights: Leveraging Events and Influencers,” Available at SSRN 3884038.

LIAUKONYTE, J., T. TEIXEIRA, AND K. C. WILBUR (2015): “Television advertising and online shopping,” Marketing Science, 34, 311–330.

LODISH, L. M., M. ABRAHAM, S. KALMENSON, J. LIVELSBERGER, B. LUBETKIN, B. RICHARDSON, AND M. E.
STEVENS (1995): “How TV advertising works: A meta-analysis of 389 real world split cable TV advertising experiments,” Journal of Marketing Research, 32, 125–139.

LOVETT, M. J. AND R. STAELIN (2016): “The role of paid, earned, and owned media in building entertainment brands: Reminding, informing, and enhancing enjoyment,” Marketing Science, 35, 142–157.

RAJARAM, P. AND P. MANCHANDA (2020): “Video Influencers: Unboxing the Mystique,” arXiv preprint arXiv:2012.12311.

SCULLION, C. (2021): “The entirety of Twitch has reportedly been leaked,” Video Games Chronicle.

SEILER, S., S. YAO, AND W. WANG (2017): “Does online word of mouth increase demand? (and how?) evidence from a natural experiment,” Marketing Science, 36, 838–861.

SEILER, S., S. YAO, AND G. ZERVAS (2018): “Causal inference in word-of-mouth research: Methods and results,” Working Paper.

SHAPIRO, B., G. J. HITSCH, AND A. TUCHMAN (2020): “Generalizable and robust tv advertising effects.”

SIMONOV, A., R. URSU, AND C. ZHENG (2020): “Do suspense and surprise drive entertainment demand? evidence from Twitch.tv,” Evidence from Twitch.tv (October 19, 2020).

TSAI, Y.-L. AND E. HONKA (2021): “Informational and Noninformational Advertising Content,” Marketing Science, 40, 1030–1058.

VINCENT, J. (2018): “Drake Drops in to Play Fortnite on Twitch and Breaks the Record for Most-Viewed Stream,” The Verge.

WILDE, T. (2020): “Should streamers pay game developers to stream their games?” PC Gamer.

YANG, J., J. ZHANG, AND Y. ZHANG (2021): “First Law of Motion: Influencer Video Advertising on TikTok,” Available at SSRN 3815124.

B Additional Estimation Details

                            Distribution of daily averages across games
                            Mean    S.E.     Q 5%    Q 25%   Q 50%   Q 75%   Q 95%
No. unique streamers        2.29    12.49    0.01    0.05    0.22    0.85    8.12
Max no. streamers live      1.80    5.33     1.00    1.00    1.01    1.14    4.30
Stream duration (hrs)       2.89    1.86     0.81    1.67    2.58    3.75    5.78

Appendix Table B.1: Broadcasting activity of top streamers across different games. These summary statistics illustrate the variation we isolate with instrumental variables z_jt, which measure how many top 5% Twitch streamers are broadcasting game j in time period t. We first compute the daily averages across all days in our sample and then report the distribution of these averages across 599 games.

B.1 Variation captured by instruments z_jt

In this section, we describe variation isolated by instruments z_{j,t}, z_{j,t−1}, ..., z_{j,t−12} in Section 3.2. Table B.1 describes the variation in the number of active top 5% streamers captured by these instruments. In this table, we compute the daily numbers of unique streamers broadcasting a specific game, maximum number of streamers broadcasting simultaneously at any point in time, and average stream duration. We then report the distribution of these game-specific averages in Table B.1.

As these statistics show, the average game is broadcasted by only 2-3 unique streamers on a given day and is broadcasted by no more than 1-2 streamers at the same time. Additionally, when top streamers do broadcast a game, those broadcasts are on average almost three hours long. This long average stream time suggests that most top streamers spend several hours exploring a given game and showcasing its gameplay rather than picking it up for a few minutes during breaks in their main activity. Lastly, we note that the median game in this sample is never broadcasted by more than one streamer at any given time. Therefore, for a typical game, our instrument z_jt captures the variation between time periods when nobody broadcasts game j and other periods when one of the top streamers picks up a game and plays it for several hours.

B.2 Grid search algorithm for δ and β

To solve the minimization problem in (4), we use the following algorithm.
We use a golden-section algorithm for the persistence parameter δ starting from a wide interval (0.001, 0.999) and terminating the search when the length of the search interval falls below the tolerance level 0.01. Given a candidate value of δ, we compute a point estimate for the parameter β using the standard closed form 2SLS formula.

The main challenge is how to deal with three high-dimensional fixed effects (game-date, time interval, and game-hour of the day). Given that our sample includes approximately 600 games and 5,300 one-hour time intervals, we need to include approximately 150,000 fixed effects. To handle the problem of this scale, we use packages reghdfe and ivreghdfe that rely on an iterative algorithm that was proposed by Guimaraes and Portugal (2010) and was further optimized by Correia (2016). This algorithm relies on a simple fixed-point iteration principle whereby all regression coefficients are partitioned into groups (e.g., by the class of fixed effects), and the algorithm iterates group-specific first order conditions while fixing the values of coefficients in all other groups. Guimaraes and Portugal (2010) show that this algorithm converges to the correct least squares estimates but manages memory more efficiently than the standard estimators. By combining this algorithm with the golden-section search described above, we obtain point estimates of parameters β and δ.

To compute standard errors clustered at the game-date level, we perform a block bootstrap that samples game-date pairs from the original sample with replacement and draws in total 50 bootstrap samples (Efron and Tibshirani, 1994, p. 86). For specifications in Sections 3-4 in which we do not estimate δ, we do not use bootstrap and instead obtain standard errors using the standard asymptotic theory of the 2SLS estimators.
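The outer search over δ described in B.2 can be illustrated with a generic golden-section minimizer. The sketch below is ours, not the authors' code. It uses the starting interval (0.001, 0.999) and the tolerance 0.01 from the text, but `toy_objective` is a hypothetical stand-in for the GMM criterion in (4); in the actual procedure, each evaluation would re-estimate β by 2SLS at the candidate δ with the fixed effects absorbed. The method also assumes the objective is unimodal on the interval.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_min(f, lo, hi, tol):
    """Minimize a unimodal f on [lo, hi]; stop once the bracket is shorter than tol."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # The minimum lies in [lo, d]: reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # The minimum lies in [c, hi]: reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def toy_objective(delta):
    # Hypothetical stand-in for the criterion in (4), minimized at delta = 0.828.
    return (delta - 0.828) ** 2


delta_hat = golden_section_min(toy_objective, 0.001, 0.999, 0.01)
print(round(delta_hat, 3))
```

Each iteration shrinks the bracket by a factor of about 0.618 and needs only one new function evaluation, so going from a bracket of length 0.998 to below 0.01 takes ten iterations. That economy matters when every evaluation involves a regression with many fixed effects.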
C Robustness Analyses

C.1 Alternative specifications

We explore the robustness of our main results from Section 3 with respect to the (a) included fixed effects, (b) definition of the instrument Z_jt, (c) definition of the time period t, and (d) sample definition. Table C.1 presents the estimates β̂ and δ̂ for different specifications.

Rows 1-4 show that our results are robust to controlling for game-week fixed effects and changing the definition of the instrument Z_jt. Our initial motivation for including game-date fixed effects was that they allow us to better control for unobserved game-specific events. Nevertheless, the estimate β̂ barely changes when we instead use game-week fixed effects (an estimate of 0.034 vs 0.033). This result suggests that, even if game popularity changes within a given week, streamers do not systematically schedule their live streams on days when a game is trending. Similarly, constructing the instrument Z_jt using a different number of lagged values z_jt does not seem to affect our estimates.

Row 5 shows how the estimates change when we switch from 1-hour to 10-minute time intervals t. While this switch makes the estimation computationally burdensome, it changes our results only marginally. We obtain an estimated elasticity of 0.032 and an implied hourly persistence parameter of (0.956)^6 = 0.763, which are reasonably similar to our main specification. Row 6 shows that dropping games that are never broadcasted by the top 5% streamers (i.e., zero variation in the instrument Z_jt) returns estimates of 0.034 and 0.849, which, once again, are close to those in the main specification.

Finally, in row 7 we explore whether our functional form assumptions in (1), especially the log expressions log(1 + players_jt) and log(1 + V_jt), generate bias. When removing all games that have on average fewer than 10 concurrent viewers or fewer than 10 concurrent players, we obtain results similar to our main specification.
This finding suggests that our estimates are unlikely to be mainly driven by the functional form assumptions. Note, however, that we find a somewhat higher estimate of the elasticity β (0.042 vs 0.033 in the main specification), likely because we are focusing on more popular games that are likely to be of higher quality.

Appendix Table C.1: Robustness analyses for the main specification in Section 3.

Specification                                              Elasticity β̂    Persistence δ̂
(1) Main specification                                     0.033           0.828
(2) Main + game-week FEs                                   0.034           0.794
(3) Main + 6 lags in the instrument Z_jt                   0.035           0.831
(4) Main + 18 lags in the instrument Z_jt                  0.033           0.831
(5) Main + 10-minute intervals t                           0.032           0.956
(6) Main + drop games without Z_jt variation               0.034           0.849
(7) Main + only non-zero players_jt and viewers_jt         0.042           0.897

[...] random splitting of trees in random forest algorithms. We estimate the heterogeneous streaming elasticities using a range of tuning parameters that take values n × 5,471, with n = 1, 2, ..., 8. We find that choosing a high leaf size regularizes the tail end of the β̂ distribution, whereas choosing a low leaf size leads to noisier results in our validation exercise (see Section D.3). We pick n = 2 to balance between the two considerations. We note, however, that our main qualitative findings change little when we move to other values of the tuning parameter.

D.3 Validation of GRF estimates

We validate the generalized random forest estimates in two ways. First, we compute the average elasticities that GRFs predict for games above and below the median of each game attribute X_j, and we compare these elasticities to our median split results from Section 4.2. To this end, we compute elasticities β̂_j predicted by the generalized random forests for the exact same subsamples. If the generalized random forests recover the true heterogeneity in streaming elasticities, the average predicted elasticities should be roughly the same as our median split 2SLS estimates in Section 4.2.
Instead, if the forest does not capture meaningful heterogeneity, the average predicted elasticities need not be aligned with the 2SLS estimates. Table D.1 compares these two sets of estimates. We find that the conditional average elasticities β̂ generated by the two methods have the same order of magnitude and follow analogous patterns. Based on this comparison, we do not find that GRFs overfit the data and produce an implausibly large dispersion of predicted streaming effects.

We also compare the distributions of estimated elasticities obtained in two different ways. We first use GRF estimates to split games into four groups that correspond to the quartiles of the estimated elasticities β̂, and we re-estimate the streaming effect by using a 2SLS regression with all observations in each quartile group. To obtain the second set of analogous estimates, we estimate a GRF on 90% of randomly selected games and then predict streaming elasticities β̂ for the remaining 10% of games. We then visualize the averages of these out-of-sample predictions for the same quartile groups as before, averaging across 10 iterations, each of which holds out a 10% subsample of games and estimates a GRF on the remaining data.

Figure D.1 compares the two sets of estimates by reporting quartile averages and confidence intervals. We find that the out-of-sample elasticities predicted by GRF monotonically increase across the four groups, which are defined by in-sample GRF predicted elasticities. We also find that average out-of-sample elasticities in the four groups align with in-sample 2SLS elasticity estimates. These findings suggest that generalized random forests indeed recover meaningful heterogeneity in streaming effects across games.

Appendix Table D.1: Comparison of estimated elasticities β̂_j from median splits and the GRF.

                                          Median Splits             Generalized Random Forests
                                          Estimate β̂    S.E.        Average β̂_j    Average S.E.
Game age:
  New games (≤2.5 years old)              0.034         (0.004)     0.030          (0.015)
  Old games (>2.5 years old)              0.032         (0.008)     0.028          (0.008)
Publisher size:
  Small publisher (1 game)                0.053         (0.011)     0.038          (0.016)
  Large publisher (2+ games)              0.018         (0.003)     0.021          (0.008)
Price:
  Inexpensive (<$20)                      0.046         (0.010)     0.037          (0.017)
  Expensive (≥$20)                        0.023         (0.004)     0.021          (0.007)
Quality:
  High quality (meta score >80)           0.041         (0.010)     0.024          (0.005)
  Low quality (meta score ≤80)            0.028         (0.004)     0.031          (0.014)
Variance:
  Niche games (rating std. >2.4)          0.050         (0.010)     0.038          (0.009)
  Mainstream games (rating std. ≤2.4)     0.008         (0.002)     0.012          (0.006)

D.4 Visualizing the GRF results

In Figure 5 of Section 4.3, we visualize the estimated elasticities β̂(X_j) for two game attributes at a time, while holding all other attributes fixed at their average levels. To construct these graphs, we first generate a grid of values for each dimension in X_j with a step equal to a 5% quantile of the observed unconditional distribution. We take the Cartesian product of these uni-dimensional grids to create a five-dimensional grid in the space of five attributes in X_j. When visualizing estimated elasticities in Figure 5, we show the estimated elasticities β̂(X) for each point on a two-dimensional grid, for the two selected game attributes, and we average the predicted β̂’s across all other dimensions of X_j. One can interpret these graphs as visualizing the conditional average treatment effects (CATE) for selected pairs of game attributes.

One concern is that some combinations of attributes in these graphs are unrealistic and are never observed in the actual data. To address this concern, in the right panel of Figure 5 we additionally visualize the empirical density of game attributes, which helps to understand which areas of the estimated elasticities β̂(X) rely on the extrapolation of GRF estimates.

[Figure D.1 plot. y-axis: β_j's quartile average; x-axis: 1st quartile, 2nd quartile, 3rd quartile, 4th quartile; series: out-of-sample β_j, linear estimate by quartile.]

Appendix Figure D.1: Out-of-sample validation of GRF results. The graph splits games into four quartiles based on the estimated distribution of streaming elasticities β̂. We compare two sets of estimates in each quartile. First, we estimate a GRF on 90% of randomly selected games and then predict streaming elasticities β̂ for the remaining 10% of games. We then use the GRF estimates to predict elasticities out-of-sample and present the mean predicted elasticities by group. Second, we estimate linear models using 2SLS, separately for each quartile, and present the in-sample average elasticities. We fix the leaf size at 2×5,471 (see Section D.2).

F Additional Figures and Tables

Appendix Figure F.1: Many popular Twitch streamers have exuberant and memorable personalities, which makes their content funny and entertaining. The figure shows screenshots from broadcasts of Twitch streamers Dr. Disrespekt (top left), Sydeon (top right), KayPea (bottom left), and Tyler1 (bottom right).

Appendix Figure F.2: When are games played and streamed?

Appendix Figure F.3: When are games played and streamed? (continued)

Appendix Figure F.4: Exact stream schedules for a random set of top streamers.
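The grid construction that Section D.4 describes for Figure 5 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: `toy_predict` is a hypothetical stand-in for the fitted forest's prediction β̂(x), the attribute values are made up, and the demo uses three attributes with a coarse quartile grid (instead of five attributes at 5% steps) so that it runs instantly. The function names are our own.

```python
import itertools
import statistics
from collections import defaultdict


def quantile_grid(values, n=20):
    """Interior cut points of the observed distribution; n=20 gives 5% steps."""
    return statistics.quantiles(values, n=n, method="inclusive")


def two_dim_slice(predict, grids, i, j):
    """Average predict(x) over every grid dimension except i and j."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # Cartesian product of the one-dimensional grids
    for point in itertools.product(*grids):
        key = (point[i], point[j])
        sums[key] += predict(point)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Made-up attribute values for a handful of games (illustration only).
attributes = [
    [0.5, 1.0, 2.0, 4.0, 8.0],        # years since release
    [0.0, 5.0, 10.0, 20.0, 60.0],     # regular price
    [65.0, 70.0, 80.0, 85.0, 90.0],   # Metacritic rating
]
grids = [quantile_grid(column, n=4) for column in attributes]


def toy_predict(x):
    # Hypothetical stand-in for the GRF prediction at attribute vector x.
    age, price, rating = x
    return 0.03 + 0.001 * age - 0.0002 * price


# Surface over (price, rating), averaged over the age dimension.
surface = two_dim_slice(toy_predict, grids, 1, 2)
for (price, rating), beta in sorted(surface.items()):
    print(price, rating, round(beta, 4))
```

A full grid over five attributes at 5% steps has on the order of millions of points, so in practice one would likely vectorize the prediction step rather than loop point by point as this sketch does.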
