GRASP

Reading between the (trend) lines - Spotting data manipulation in the wild


May 6th, 2026, by Amy Taylor Tag(s): Data, Data literacy

If you’ve spent any time working with research data, you’ve probably manipulated it. And even if you haven’t, you’ve likely read some research articles featuring data that has been manipulated. This is usually perfectly fine, essential even, but the word ‘manipulation’ carries some baggage, and for good reason. Data can be presented in some misleading ways, and learning to spot the difference between legitimate data processing and research misconduct is a valuable skill.

What is data manipulation?

‘Data manipulation’ simply means making changes to data, but this umbrella term covers a wide spectrum of possibilities. On one end, you have legitimate processing: cleaning datasets, transforming variables, removing genuine outliers (with documented justification), or presenting data on a log scale to make patterns visible. Even cleaning up messy survey responses or adjusting the brightness of a microscopy image counts as data manipulation. Researchers do this all the time, and it is entirely appropriate as long as it’s transparent and reproducible.

On the other end of the spectrum sit selective omission, fabrication, and visual trickery. Whether careless or deliberate, these are the practices that misrepresent findings and are the reason the word ‘manipulation’ gets a bad rap. The key to spotting the difference is transparency. Legitimate data manipulation will be disclosed in the methods section, while problematic manipulation might be hidden in the fine print or won’t appear at all.

What to look for

Research misconduct in relation to data is taken very seriously by universities, by journals, and by funding bodies. The Australian Code for the Responsible Conduct of Research is clear that researchers must not fabricate, falsify, or misrepresent research data. Here are some potential red flags to look out for as you read an article.

  • Axes manipulation: One of the easiest ways to distort a graph is to manipulate the axes, and it’s surprisingly common even in high-quality publications. Watch out for truncated y-axes, inconsistent intervals, or dual axes with mismatched scales. Always check where the y-axis begins and how the tick marks are spaced (a short sketch after this list shows how much a truncated axis can exaggerate a small difference).

  • Log scales: Logarithmic scales are genuinely useful. They’re essential for visualising data that spans several orders of magnitude, such as bacterial growth, earthquake magnitudes, or income distributions. But they’re also frequently misunderstood. On a log scale, equal distances represent equal ratios, not equal differences. A straight upward line on a log scale means exponential growth, which looks deceptively moderate compared to the same data on a linear scale. This matters enormously when interpreting the severity of a trend. If a paper uses a log scale, check whether the choice is justified and whether the axis is clearly labelled. If you’re comparing two papers on the same topic and one uses a log scale while the other doesn’t, be careful about directly comparing their visual conclusions (the sketch after this list shows the same growth curve on both scales).

  • Misleading data visualisation: Beyond axes, there are broader visualisation choices that can mislead, including cherry-picked time windows, 3D charts, smoothing and interpolation, and omitted error bars. Look for whether the time period shown is justified, and check whether the smoothing method is disclosed. And there is almost never a legitimate reason to use a 3D chart in academic research.

  • Common interpretation errors: Even when the data itself is presented honestly, interpretation errors in the text can be just as misleading. These might take the form of misinterpreted p-values, hypothesising after the results are known, overgeneralisation, or, the classic, confusing correlation with causation. Look at sample sizes, watch for papers that treat statistical significance as synonymous with practical importance, and always be alert to causal language like “leads to” or “causes” (and remember, Nicolas Cage’s career probably doesn’t affect drowning numbers. Probably).
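
To make the first two of these red flags concrete, here is a minimal sketch in Python. The libraries (numpy and matplotlib), the numbers, and the growth rate are illustrative choices, not taken from any particular paper: it draws the same small group difference with a full and a truncated y-axis, and the same exponential growth curve on linear and log axes.

```python
# Minimal illustration of two common distortions: a truncated y-axis and a
# log scale. Requires numpy and matplotlib; all numbers are made up.
import numpy as np
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(9, 6))

# Top row: the same ~4% group difference, with and without a truncated y-axis.
groups = ["Control", "Treatment"]
means = [50.0, 52.0]

axes[0, 0].bar(groups, means)
axes[0, 0].set_ylim(0, 60)                  # axis starts at zero: modest difference
axes[0, 0].set_title("Full y-axis")

axes[0, 1].bar(groups, means)
axes[0, 1].set_ylim(49, 53)                 # truncated axis: same data, dramatic gap
axes[0, 1].set_title("Truncated y-axis")

# Bottom row: exponential growth (doubling every 5 days) on linear vs log axes.
days = np.arange(0, 30)
cases = 10 * 2 ** (days / 5)

axes[1, 0].plot(days, cases)
axes[1, 0].set_title("Linear y-axis: growth looks explosive")

axes[1, 1].plot(days, cases)
axes[1, 1].set_yscale("log")                # equal distances now mean equal ratios
axes[1, 1].set_title("Log y-axis: same data, straight line")

plt.tight_layout()
plt.show()
```

Neither version of either plot is necessarily wrong on its own; the point is that the reader’s impression depends heavily on choices the figure does not always announce.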

What you can do

How do you avoid making these same mistakes in your own research? Data manipulation, the good kind, is a skill you’ll keep developing throughout your research career. The more transparent and principled your approach, the more confidently you can stand behind your findings. A few practical habits will go a long way towards avoiding accidental misconduct.

  • Document everything: Keep a research journal or data log that records every decision you make about your data, including what you changed, when, and why.

  • Work with raw data safely: Always keep an unaltered copy of your original dataset and only make changes to working copies (one lightweight way to set this up is sketched after this list).

  • Talk to your supervisor: If you’re unsure whether a data decision is appropriate, ask. That’s what supervisors are for.
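
Below is a minimal sketch of how the first two habits can look in practice. The file names, folder layout, and logging format are hypothetical and purely for illustration; the point is simply that the raw file is copied once and every subsequent decision is written down.

```python
# Sketch of a "raw data stays untouched" workflow (hypothetical paths).
import shutil
from datetime import datetime, timezone
from pathlib import Path

RAW = Path("data/raw/survey_2026.csv")        # original export: never edited
WORK = Path("data/working/survey_2026.csv")   # all cleaning happens on this copy
LOG = Path("data/working/changes.log")        # plain-text record of every decision

def start_working_copy() -> None:
    """Copy the raw file once, so the original is always available to revert to."""
    WORK.parent.mkdir(parents=True, exist_ok=True)
    if not WORK.exists():
        shutil.copy2(RAW, WORK)

def log_change(description: str) -> None:
    """Append a timestamped note recording what was changed and why."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with LOG.open("a", encoding="utf-8") as f:
        f.write(f"{stamp}  {description}\n")

start_working_copy()
log_change("Removed 3 responses with impossible ages (>120); see cleaning notes.")
```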

Learning to spot these issues takes practice, and it doesn’t mean you should approach the literature with blanket suspicion. Most researchers are working in good faith. But building your critical reading skills protects your own work: you won’t inadvertently cite a flawed finding as solid evidence, and you’ll be better equipped to present your own data with the transparency and rigour that good research demands.


Please leave any anonymous comments, feedback, or suggestions for further posts at this link. If you would like to get in touch, or write a post for the Ideas Hub blog, please email karen.miller@curtin.edu.au. We welcome contributions from HDR students!


Photo by Conny Schneider on Unsplash