Well, Mike Piazza has a slightly higher career batting average (2127 hits / 6911 at-bats = 0.308) than Hank Aaron (3771 hits / 12364 at-bats = 0.305). But can we say with confidence that his skill is actually higher, or is it possible he just got lucky a bit more often?
In this series of posts about an empirical Bayesian approach to batting statistics, we’ve been estimating batting averages by modeling them as a binomial distribution with a beta prior. But we’ve been looking at a single batter at a time. What if we want to compare two batters, give a probability that one is better than the other, and estimate by how much?
This topic is relevant to my own work and to the data science field, because understanding the difference between two proportions is central to A/B testing. One of the most common examples of A/B testing is comparing clickthrough rates (“out of X impressions, there have been Y clicks”), which on the surface is similar to our batting average estimation problem (“out of X at-bats, there have been Y hits”).
Here, we’re going to look at an empirical Bayesian approach to comparing two batters. We’ll define the problem in terms of the difference between each batter’s posterior distribution, and look at four mathematical and computational strategies we can use to resolve this question. While we’re focusing on baseball here, remember that similar strategies apply to A/B testing, and indeed to many Bayesian models.
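As a rough illustration of the comparison described above, here is a minimal simulation sketch in Python. It assumes a shared Beta(alpha0, beta0) prior whose values (100 and 290) are hypothetical placeholders chosen only for illustration, not estimates from the text; the function names are likewise invented for this sketch. Each batter's posterior is Beta(alpha0 + hits, beta0 + at-bats - hits), and the probability that one batter is better is approximated by drawing from both posteriors and counting how often the first draw is larger.

```python
import random


def posterior_params(hits, at_bats, alpha0, beta0):
    """Beta posterior parameters after updating a Beta(alpha0, beta0) prior."""
    return alpha0 + hits, beta0 + (at_bats - hits)


def prob_a_beats_b(post_a, post_b, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_a > rate_b) under two Beta posteriors."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(n_draws):
        if rng.betavariate(*post_a) > rng.betavariate(*post_b):
            wins += 1
    return wins / n_draws


# Hypothetical prior parameters (assumption, not taken from the text).
ALPHA0, BETA0 = 100.0, 290.0

# Career totals quoted in the text.
piazza = posterior_params(2127, 6911, ALPHA0, BETA0)
aaron = posterior_params(3771, 12364, ALPHA0, BETA0)

p_piazza_better = prob_a_beats_b(piazza, aaron)
print(f"Estimated P(Piazza > Aaron): {p_piazza_better:.3f}")
```

The same function works for an A/B test by passing clicks and impressions instead of hits and at-bats. Simulation is only one possible strategy; it trades exactness for simplicity, and the estimate depends on the prior chosen.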
Dr. Mahdi Salehi, an associate member of SDAT and assistant professor of statistics at the University of Neyshabur, introduced an online interactive dashboard that visualizes and tracks confirmed cases of COVID-19 in real time. The dashboard was made publicly available on 6 April 2020 and shows the counts of confirmed cases, deaths, and recoveries of COVID-19 at the country or continent level. It is intended as a user-friendly tool for researchers as well as the general public to track the COVID-19 pandemic. It draws on trusted data sources and is built with open-source R software (Shiny in particular), which supports transparency and reproducibility.
Access the shiny dashboard: https://mahdisalehi.shinyapps.io/Covid19Dashboard/
The Scientific Data Analysis Team (SDAT) intends to organize its first event on the value of data, giving data holders and data analyzers an opportunity to extract maximum value from their data. The event is organized by the International Statistical Institute (ISI) and SDAT, and hosted at Bu-Ali Sina University, Hamedan, Iran.
The organizers and the data providers will give more information about the goals, the initial ideas, team arrangement, competition processes, and the benefits of attending this event in a webinar hosted on the ISI GoToWebinar system. Everyone is invited to participate in this webinar for free, but registration on the webinar system is required by 30 December 2020.
Event Time: 31 December 2020 - 13:30-16:30 Central European Time (CET)
Register for the webinar: https://register.gotowebinar.com/register/8913834636664974352
More details about this event: http://sdat.ir/en/playdata
Aims and outputs:
• Playing with real data using exploratory and predictive data analysis techniques
• Providing a platform between a limited number of data providers and hundreds to thousands of data science teams
• Improving the creativity and scientific reasoning of data scientists and statisticians
• Finding possible “bugs” in current data analysis methods and new developments
• Learning different views about a dataset
Awards:
The best-report awards carry cash prizes:
$400 for first place,
$200 for second place, and
$100 for third place.
Important Dates:
Event Webinar: 31 December 2020 - 13:30-16:30 Central European Time (CET).
Team Arrangement: 01 Jan. 2021 - 07 Jan. 2021
Competition: 10 Jan. 2021 - 15 Jan. 2021
First Assessment Result: 25 Jan. 2021
Selected Teams Webinar: 30 Jan. 2021
Award Ceremony: 31 Jan. 2021
Please share this event with your colleagues, students, and data analyzers.
The Development of Structural and Functional Neuroimaging Symposium was held at the School of Sciences, Shiraz University, on April 17, 2019. The Advanced fMRI Data Analysis Workshop was also held on April 18-19, 2019. For more information please visit: http://sdat.ir/dns98
The Rfssa package is available on CRAN. Dr. Hossein Haghbin and Dr. Seyed Morteza Najibi (SDAT members) have published this package to provide a collection of the functions needed to implement Functional Singular Spectrum Analysis (FSSA) for analysing Functional Time Series (FTS). FSSA is a novel non-parametric method for decomposing and reconstructing FTS. For more information please visit the package's GitHub homepage.
The Symposium of Data Science Development and Its Job Opportunities was held at the Faculty of Science, Shiraz University, on Feb 20, 2019. For more information please visit: http://sdat.ir/dss97
SDAT stands for Scientific Data Analysis Team. It consists of groups of specialists in various fields of data science, including Statistical Analytics, Business Analytics, Big Data Analytics, and Health Analytics.
Address: No. 15, 13th West Street, North Sarrafan, Apt. No. 1, Saadat Abad, Tehran
Phone: +98-910-199-2800
Email: info@sdat.ir