Statistics
http://events.berkeley.edu/index.php/calendar/sn/stat.html
Upcoming Events
Seminar 217, Risk Management: Linking 10-K and the GICS - through Experiments of Text Classification and Clustering, Apr 16
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=122096&date=2019-04-16
A 10-K is an annual report filed by a publicly traded company about its financial performance and is required by the U.S. Securities and Exchange Commission (SEC). 10-Ks are fairly long and tend to be complicated, but the 10-K is one of the most comprehensive and most important documents a public company can publish on a yearly basis. The Global Industry Classification Standard (GICS) is an industry taxonomy developed in 1999 by MSCI and S&P Dow Jones Indices and is designed to classify a company according to its principal business activity. The GICS hierarchy begins with 11 sectors and is followed by 24 industry groups, 68 industries, and 157 sub-industries. We ask two questions. First, can a classifier be trained to recognize a firm's GICS sector based on the textual information in its 10-K? Second, can we extract, from the classifier, embeddings (low-dimensional vectors) for 10-Ks that respect their GICS sectors, so that firms within the same sector have embeddings that are close (as measured by cosine similarity)? We report on a series of experiments with a convolutional neural network (CNN) for text classification, trained on two variants of document representation: one uses pre-trained word vectors, and the other is based on the simple bag-of-words model.
Conformal embedding and percolation on the uniform triangulation, Apr 17
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=125260&date=2019-04-17
Following Smirnov’s proof of Cardy’s formula and Schramm’s discovery of SLE, a thorough understanding of the scaling limit of critical percolation on the regular triangular lattice has been achieved. Smirnov’s proof in fact gives a discrete approximation of the conformal embedding, which we call the Cardy embedding. In this talk I will present a joint project with Nina Holden where we show that the uniform triangulation under the Cardy embedding converges to the Brownian disk under the conformal embedding. Moreover, we prove a quenched scaling limit result for critical percolation on uniform triangulations. Time permitting, I will also explain how this result fits into the larger picture of random planar maps and Liouville quantum gravity.
From correlation to causation - measuring ad effectiveness at scale, Apr 17
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=125235&date=2019-04-17
Everyone has had that one ad for that one pair of shoes seem to follow them everywhere they go on the internet. Why does that happen? Especially if you already bought the shoes? To make sense of this, it's worth understanding how marketers have historically measured ad effectiveness -- and why the problem is harder than it seems. Beyond improvements in measuring ad effectiveness, this talk will dive into the uniquely statistical problems we face in ad tech and some of the ways we are approaching them.
BIDS Data Science Lecture: Astrophysical Machine Learning, Apr 18
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=124964&date=2019-04-18
From streaming, repeated, noisy, and distorted images of the sky, time-domain astronomers are tasked with extracting novel science as quickly as possible with limited and imperfect information. Employing algorithms developed in other fields, we have already reached important milestones demonstrating the speed and efficacy of using ML in data and inference workflows. Now we look to innovations in learning architectures and computational approaches that are purpose-built alongside the specific domain questions. I will describe such efforts (developed in the search for Planet 9, for new classes of variable sources, and for data-driven emulators) and discuss ongoing efforts to imbue physical understanding into the learning process itself.
Seminar 217, Risk Management: Private Company Valuations by Mutual Funds, Apr 23
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=122095&date=2019-04-23
We study how cross-sectional and temporal variation in the valuation practices that mutual funds apply to private startup holdings affects investors’ access to these pre-IPO firms. Price dispersion across fund families holding the same security averages 10.0%, and is as large as 25% in some quarters. 42% of reported prices are not updated between quarters, but large valuation changes occur when startups close a new funding round. Thus, follow-on rounds lead to predictably strong fund returns in the days after the event. Fund families tend to allocate private securities to their best performers and high-fee funds. Moreover, fund managers with incentives to boost periodic returns mark up their private securities more aggressively after year-end follow-on rounds. We also find weak evidence of strategic return smoothing, with a lower incidence of markdowns of private securities in bear markets.
The topologies of random real algebraic hypersurfaces, Apr 24
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=125369&date=2019-04-24
The topology of a hypersurface in P^n(R) of high degree can be very complicated. However, if we choose the surface at random there is a universal law. Little is known about this law and it appears to be dramatically different for n=2 and n>2. There is a similar theory for zero sets of monochromatic waves which model nodal sets. Joint work with Y. Canzani and I. Wigman.
Cooperating with the Curse of Dimensionality, Apr 25
http://events.berkeley.edu/index.php/calendar/sn/stat.html?event_ID=125386&date=2019-04-25
The curse of dimensionality arises when analyzing high-dimensional data and non-Euclidean data, such as network data, which are ubiquitous nowadays. It causes counter-intuitive phenomena and makes traditional statistical tools less effective or inapplicable. On the other hand, some counter-intuitive phenomena might be explained by some universal patterns, which could be used to form new effective tools in dealing with high-dimensional/non-Euclidean data. In this talk, one such unique pattern is explored and applied to fundamental statistical tasks, including hypothesis testing and cluster analysis, leading to substantial improvements in conducting these tasks for high-dimensional/non-Euclidean data. Some other related topics will also be briefly discussed.
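The abstract does not say which pattern the talk exploits. As general background only, one well-known counter-intuitive effect in high dimensions is distance concentration: the nearest and farthest neighbours of a point end up almost equally far away. The sketch below is an illustration under stated assumptions, not the speaker's method. The function name `relative_contrast`, the uniform sampling in the unit cube, and the sample size are all hypothetical choices made for the example.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min over the Euclidean distances from one
    random query point to n_points random points, all drawn uniformly
    from the unit cube [0, 1]^dim (an assumption made for illustration).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast between nearest and farthest neighbour shrinks as
    # the dimension grows, which weakens distance-based tools.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

A small relative contrast means nearest-neighbour distances carry little information, which is one reason traditional distance-based tests and clustering methods can lose power in high dimensions.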