There's really just one criterion to apply when judging whether an engineering environment has improved the life of engineers.
Does the environment do a better job of helping programmers create what they want to create? What they want might be a particular hand-crafted interface to a special set of functionality; or a difficult-to-visualize network protocol; or a simulation and visualization of something running within a machine, such as software or hardware.
But the criterion of success should be the same. Does the environment move programmers towards a situation, even a very domain-specific situation, where the manner in which they think about and evaluate success is now better served than the technical means to achieve technical ends? Are they now better able to "make the thing" as they conceive and feel it, with fewer implementation requirements?
If not, we are simply building environments out of technical necessity. We are not moving programmers towards a better future.
Some good examples can be found occasionally among domain-specific languages; others among special-purpose interfaces. When good, they are the result of a high sensitivity to the considerations and desired outcomes of people.
So, say one makes a precise statement about the desired behavior of a program, a repair, a new feature, a new program. This is of course an iterative process.
Let's imagine that the final effort devoted to making the desiderata precise is D. That includes time interacting with the system that demands an exact operational description, and time to evaluate whether the now-visible result is the desired result.
Everything else is unrelated technical time, or U. This is not to denigrate it. Only to measure the impression of the effort needed to express D.
The ease (E) of using the programming language or environment for the implementation of that particular story or feature can be characterized as a ratio of two measurable impressions:
E = D / U
Obviously it takes a great deal of work, research, and sensitivity to make U smaller, and so to raise E.
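As a rough illustration of the ratio, suppose D and U could be logged as hours for a single story or feature. The sketch below is only that: the function name `ease` and the hour figures are hypothetical, chosen for the arithmetic, and are not measurements of any real environment.

```python
def ease(d_hours: float, u_hours: float) -> float:
    """Return E = D / U for one story or feature.

    d_hours: time spent making the desiderata precise (D).
    u_hours: all other, unrelated technical time (U).
    Both are assumed to be measured in the same unit.
    """
    if u_hours <= 0:
        raise ValueError("U must be positive for the ratio to be defined")
    return d_hours / u_hours


# Hypothetical figures: the same feature built in two environments.
# D stays fixed at 4 hours; only the unrelated technical time changes.
before = ease(d_hours=4.0, u_hours=16.0)
after = ease(d_hours=4.0, u_hours=8.0)

print(before, after)
```

With D held fixed, halving U doubles E. An environment that only swaps one kind of technical time for another leaves U, and so E, where it was.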
But I rarely see a drop in U in general-purpose programming languages or integrated development environments. These tools, whose job should be to ease the expression of thought, almost never improve the overall E of programmers. They're instead a bag of replacement techniques that contribute to U, with endless renaming of nearly-equivalent logical concepts and endless unexamined logical complications that follow from them, all with nearly no impact on E. Unfortunately, these new general-purpose environments and languages are tenaciously promoted, and so another generation of programmers is trapped and unserved.
I don't think this situation is irresolvable. But it cannot be resolved by people who are unaware of this basic issue.
Clearly this is just a first-order definition of factors, not yet sufficient for experimental confirmation. For example, when constructing an experiment we would need to eliminate experience and facility with particular notational or symbolic conventions, as best we can, and carefully evaluate what was left. We would also need to isolate motivational factors: for example, being sure that subjects are producing what they want to produce, not simply achieving some goal provided by the investigator. And we would need to provide homologous situations, as best we can, so that we can use concomitant variation to adjust the major variables under investigation: the denotation of machine operations and the mind-internal semantic agreements.