David Aikman
/
Recent content on David Aikman
Hugo -- gohugo.io
en-us
Fri, 02 Sep 2022 00:00:00 +0000

fun-stats: making a statistics library for JavaScript
/2022/09/02/fun-stats-making-a-statistics-library-for-javascript/
Fri, 02 Sep 2022 00:00:00 +0000
I recently made a JavaScript library for fundamental statistics (fun-stats).
It was fun! I learned how to use rollup.js for bundling code, and how to publish packages to npm. I also learned a lot about the mathematics behind the statistics I have been applying for years, which was a gap in my knowledge I had been wanting to fill for a while.
my package on npm
You can see the documentation for this package here.

Making a portfolio website
/2022/06/30/making-a-portfolio-website/
Thu, 30 Jun 2022 00:00:00 +0000
Making my own portfolio/blog website
I made a portfolio website using React. It looks like this:
I wanted to have card-based content (i.e., grids of boxes) and the ability to navigate to different categories of portfolio items, rather than just listing all items on the same page. I also wanted it to look fun, and to have some kind of animation. You can’t see the animation in the screenshot, but I made the folders spin around and open and close.

About Me
/about/
Tue, 20 Jul 2021 13:54:51 -0700
Hello! I’m David :)
Right now (August 2021) I’m a Data Scientist at Public Health Scotland. I build infrastructure for automating the processing and publishing of data. I also lead the Open Data project.
I used to be a researcher at Emerald Works (which has been rebranded as “Mind Tools for Business”), where I did things like data analysis, writing and designing interactive research reports using D3.js, and assisting with the development of a product (a benchmarking tool) with an agile team.

Playing with D3
/2021/07/20/playing-with-d3/
Tue, 20 Jul 2021 00:00:00 +0000
I wanted to try out using D3 in an R Markdown document, to see how it works. Using r2d3, it works really well!
Mandy Norrbo and I were messing around trying to make some purely fun stuff in D3, which you don’t really see very often. And this is what I made. I would usually go for p5.js to make stuff like this, but it was a nice challenge to do it in D3 instead.

Exploring Cross-Validation!
/2020/11/05/exploring-cross-validation/
Thu, 05 Nov 2020 00:00:00 +0000
Here are some demonstrations of different cross-validation techniques. For a broad explanation of cross-validation, see the bottom of this post.
Simple Holdout Cross-Validation
Randomly splitting data into a training and testing set, once.
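As a rough illustration of a single random holdout split, here is a minimal Python sketch. The function name `holdout_split` and its parameters (`test_fraction`, `seed`) are hypothetical and only for illustration; they are not taken from the original post.

```python
import random


def holdout_split(data, test_fraction=0.2, seed=None):
    """Shuffle the data once and cut it into a training and a testing set."""
    rng = random.Random(seed)  # a seed makes the split reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Keep at least one observation in the testing set.
    n_test = max(1, round(len(shuffled) * test_fraction))
    test_set = shuffled[:n_test]
    train_set = shuffled[n_test:]
    return train_set, test_set


train, test = holdout_split(range(10), test_fraction=0.3, seed=1)
print(len(train), "training observations,", len(test), "testing observations")
```

Because the split happens only once, the estimate of model performance depends on which observations happen to land in the testing set.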
Repeated Cross-Validation
Leave-\(p\)-Out
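A minimal Python sketch of leave-\(p\)-out, assuming observations are referred to by their index. Every combination of \(p\) indices is held out once as the testing set, and the remaining indices form the training set. The generator name `leave_p_out` is hypothetical.

```python
from itertools import combinations


def leave_p_out(n_observations, p):
    """Yield (train_indices, test_indices) for every combination of p held-out indices."""
    indices = range(n_observations)
    for test_idx in combinations(indices, p):
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, list(test_idx)


splits = list(leave_p_out(n_observations=5, p=2))
print(len(splits), "train/test splits")
```

The number of splits is the binomial coefficient "n choose p", which grows quickly, so this approach is usually only practical for small datasets or small \(p\).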
This algorithm randomly selects \(p\) observations to exclude from the training set. These \(p\) observations constitute the testing set. This process is repeated until all possible combinations of \(p\) data-points have been used as a testing set.

Bias and Noise
/2020/09/03/bias-and-noise/
Thu, 03 Sep 2020 00:00:00 +0000
Imagine you’ve hit this target 20 times, with whatever weapon you’d like. Your hits are represented by the yellow dots. You were trying to hit the bullseye, but you might not have been completely accurate. There might have been some bias in your aim, which means you are accidentally aiming at a point other than the bullseye. And there might have been some noise. This means you do not hit where you are aiming 100% of the time.

Invisible Women: NHS Condition Descriptions
/2020/06/23/invisible-women-nhs-web-scraping/
Tue, 23 Jun 2020 00:00:00 +0000
After reading Invisible Women, I happened to come across this article on WebMD about menstrual pain that lists risk factors like this:
"The following circumstances may make a woman more likely to experience menstrual cramps:
She started her first period at an early age (younger than 11 years).
Her menstrual periods are heavy.
She is overweight or obese.
She smokes cigarettes or uses alcohol.
She has never been pregnant."

Symptoms of depression as stochastic cellular automata
/2020/05/25/symptoms-of-depression-as-stochastic-cellular-automata/
Mon, 25 May 2020 00:00:00 +0000
This is inspired by the paper ‘Mean field dynamics of stochastic cellular automata for random and small-world graphs’, which goes beyond depression and stochastic cellular automata.
As opposed to typical ‘latent factor models’ of psychopathology (mental illness), whereby an illness like depression is caused by some underlying mechanism, independent from its symptoms, there are ‘network models’ that treat the symptoms of an illness as being the cause of each other (see Robinaugh et al.

Belief Polarization
/2020/05/18/belief-polarization/
Mon, 18 May 2020 00:00:00 +0000
This is an animated simulation based on the paper ‘The polarization within and across individuals: the hierarchical Ising opinion model’.
(You probably can’t see it properly on a small screen.)
Each circle represents a person. The colour of the circle shows what that person currently believes. The more green or blue a circle, the more strongly the person aligns with the green or blue belief (respectively). The average belief of all people can be seen on the right-hand panel, as well as the recent history of the average belief.

Hering Illusion
/2020/05/02/hering-illusion/
Sat, 02 May 2020 00:00:00 +0000
This is the Hering Illusion!
Click anywhere on the animation to take control (then click again to relinquish your iron grip).
The straight red lines should appear to be bent when they are close enough to the centre of the radial lines. If you wait, the red lines will change into a red square (not because of the illusion - I just programmed a square to replace them).
Learn more about the illusion here on the Illusions Index.

Central Limit Theorem and Latent Variables
/2019/12/03/central-limit-theorem-and-latent-variables/
Tue, 03 Dec 2019 00:00:00 +0000
I recently came across the Central Limit Theorem (CLT), and wanted to visualise the phenomenon using an interactive Shiny app.
Basically, the CLT says that when you take the means (or sums) of many independent variables with finite variance, the resulting distribution tends towards normal, largely regardless of the distribution of each variable.
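To make this concrete, here is a small Python simulation sketch (not the Shiny app from the post). It assumes each variable is an independent draw from an exponential distribution, which is strongly skewed, and looks at the distribution of the means. The function name `sample_means` and all parameter values are illustrative assumptions.

```python
import random
import statistics


def sample_means(draw, n_variables, n_samples, seed=0):
    """Return n_samples means, each computed over n_variables independent draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n_variables)])
        for _ in range(n_samples)
    ]


# Exponential(1) has mean 1 and standard deviation 1, and is far from normal.
means = sample_means(lambda rng: rng.expovariate(1.0), n_variables=50, n_samples=2000)

print("mean of the means:", round(statistics.fmean(means), 3))
print("sd of the means:", round(statistics.stdev(means), 3))
```

Under these assumptions the means should cluster around 1 with a standard deviation close to \(1/\sqrt{50}\), and a histogram of `means` should look roughly bell-shaped even though the individual draws do not.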
In the context of psychometrics, the CLT means that multi-item measures may produce approximately normally distributed data when aggregated, regardless of the distribution of responses to each individual item.

Correlation p-values
/2019/11/26/correlation-p-values/
Tue, 26 Nov 2019 00:00:00 +0000
In Daniël Lakens’ MOOC, Improving Your Statistical Inferences, he discusses the distribution of p-values you can expect from a t-test when there is a true effect, and when there is not.
If there is no true effect, all p-values are equally likely.
If there is a true effect, the probability of seeing a significant p-value is equivalent to the power of the test.
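These two statements can be checked with a quick simulation. The sketch below is an assumption-laden stand-in: to stay within the Python standard library it uses a two-sided one-sample z-test with a known standard deviation of 1, not a t-test or a correlation test. The function names, sample size and effect size are all hypothetical.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: population mean = 0, with known sigma."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_p_values(true_mean, n=30, n_sims=2000, seed=0):
    """Run n_sims experiments of size n and collect their p-values."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(n_sims)
    ]


def proportion_significant(p_values, alpha=0.05):
    """Share of p-values below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


null_p = simulate_p_values(true_mean=0.0)    # no true effect
effect_p = simulate_p_values(true_mean=0.5)  # true effect of 0.5 standard deviations

print("significant with no effect:", proportion_significant(null_p))
print("significant with an effect:", proportion_significant(effect_p))
```

With no true effect, the share of significant results should sit near the alpha level, because the p-values are spread evenly between 0 and 1. With a true effect, the share approximates the power of the test for that effect size and sample size.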
I decided to try some simulations for myself, but instead of using t-tests, I used Spearman’s Rank correlations.

False Normality
/2019/11/14/false-normality/
Thu, 14 Nov 2019 00:00:00 +0000
A lot of statistical tests assume normal distribution of raw data or residuals from predicted model outcomes. One way to check for normality is visual, using histograms or Q-Q plots. Another way is statistical.
Of the statistical tests of normality, the Shapiro-Wilk test is the most powerful. See Razali and Wah for further discussion about power.
I’m more interested in the false-positive rate of the Shapiro-Wilk test than its power (the true-positive rate), as if the test falsely identifies data from a non-normal population as normal, it would undermine further analyses, and the conclusions of a study.