Chrome's performance tools are the best way to look into issues like this. If you record and reload the page, you'll be able to gather all kinds of information on what the browser is doing that may provide vital clues. Here's a run:
In the Frames section, the red box indicates the dropped frame, so you can look just before that to see what is happening. The most concentrated section appears to be 4 different "Raster Threads".
Rasterizing is the process of converting image information into bitmap image data. Sure enough, you have 4 images that fade in as part of the animation. So the raster threads wait until the images are first shown, and only then convert the .png image data into a bitmap representation to display on the screen.
All four of these images are only 75 KB total, so that seems small. But the reality is they just compress extremely well, since they are mostly transparent pixels. Each one is 2,400 x 2,400 pixels, so when rasterized at 4 bytes per pixel (RGBA) it occupies about 22 MB of bitmap image data, or 88 MB total. This allocation and drawing takes a while (well, 25 milliseconds on my machine, which is longer than a single frame gets at 60 fps, roughly 16.7 milliseconds).
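To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes the decoded bitmap uses 4 bytes per pixel (8-bit RGBA), which is my assumption about the in-memory format rather than something read out of the trace, and the function names are just for illustration.

```typescript
// Assumption: decoded bitmaps are stored as 8-bit RGBA, i.e. 4 bytes per pixel.
const BYTES_PER_PIXEL = 4;

// Size in bytes of an uncompressed bitmap with the given dimensions.
function decodedBitmapBytes(width: number, height: number): number {
  return width * height * BYTES_PER_PIXEL;
}

// Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes).
function toMiB(bytes: number): number {
  return bytes / (1024 * 1024);
}

const perImage = decodedBitmapBytes(2400, 2400); // 23,040,000 bytes
const total = perImage * 4; // four images in the animation

console.log(toMiB(perImage).toFixed(1)); // "22.0"
console.log(toMiB(total).toFixed(1)); // "87.9"
```

The compressed file size tells you almost nothing here; the pixel dimensions are what determine how much memory has to be allocated and filled.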
Chrome is trying to be clever by deferring the rasterization until the image is first shown. Since with dynamic content the image may never be shown, this can potentially save memory, at the expense of doing the work later and perhaps causing a stutter. Safari makes a different tradeoff, and tends to rasterize when the image is in the DOM, regardless of opacity.
You could probably try tricking Chrome into rasterizing earlier by doing something like starting the images at 1% opacity, where they are imperceptible to humans, or covering them up with the background, but this is probably a bit silly.
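If you did want to experiment with the 1% opacity idea, a minimal sketch might look like the following. Everything here is hypothetical: the helper names, the `.fade-image` selector and the 500 ms duration are made up for illustration, and I haven't verified that this actually makes Chrome rasterize earlier.

```typescript
// Minimal shape the sketch needs; a real HTMLElement satisfies it.
interface Fadeable {
  style: { opacity: string };
}

// 1% opacity: practically invisible, but not fully hidden.
const NEARLY_HIDDEN = 0.01;

// Hypothetical helper: park the image at 1% opacity instead of 0 while it waits.
function parkNearlyHidden(el: Fadeable): void {
  el.style.opacity = String(NEARLY_HIDDEN);
}

// Keyframes for the later fade-in, starting from the parked opacity.
function fadeInKeyframes(): { opacity: number }[] {
  return [{ opacity: NEARLY_HIDDEN }, { opacity: 1 }];
}

// Browser usage (".fade-image" is a made-up selector):
// document.querySelectorAll<HTMLElement>(".fade-image").forEach((img) => {
//   parkNearlyHidden(img);
//   // later, when the fade should start:
//   img.animate(fadeInKeyframes(), { duration: 500, fill: "forwards" });
// });
```

Even if it works, you are still paying for the full-size bitmaps, just at a different moment.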
Instead my recommendation would be to crop the images to their content so they are much smaller. They are nearly all blank space and don't need to be 2400x2400. This will be kinder on everyone's machines too, since you won't be eating up their RAM.
Otherwise you could also use text, though you'd need to use a webfont to display it all properly. This also has added benefits of accessibility. I get why images in this case may be easier since you've already got the assets though.