Steven's experiment compared learning by reading text with learning by listening. This does not compare media as such; it compares sensory inputs to the subjects--visual with auditory. (He could, for example, have compared two different media more objectively by offering the same text on screen and on paper.) That's OK, as he chooses to define "media" in this way, but it leads to further complications.
By using differing sensory inputs, he introduces extra variables. In the extreme, a blind person would learn better from audio, a deaf person by reading. Did he make any effort to establish the relative visual and auditory capabilities of the subjects?
Reading ability may be a factor. His subjects were undergraduates, whose reading ability would probably be well above average. Running the same experiment with a group of poor readers might well have produced the opposite result. To support so bald a conclusion as "media influence learning," I would want to see a much more representative group of subjects drawn from the population at large, with a range of reading abilities.
Linguistic ability may be a factor. Assuming this research was carried out in English, those whose native language was not English might show different capabilities in reading and in listening to something other than their native tongue. Even the speaker's accent may be a factor when listening to audio, though not when reading text.
Looking and listening are not necessarily independent. So, when looking at the screen, what were the subjects listening to? When listening to the audio, what were the subjects looking at?
Finally, how were the post-tests delivered? Print? Screen? Audio?
I suppose what I'm suggesting is that the experiment had only limited control of the variables inherent in the subject population.