data transformation

(noun)

The application of a deterministic mathematical function to each point in a data set.

Related Terms

  • central limit theorem
  • confidence interval

Examples of data transformation in the following topics:

  • When to Use These Tests

    • "Ranking" refers to the data transformation in which numerical or ordinal values are replaced by their rank when the data are sorted.
    • Data transformation refers to the application of a deterministic mathematical function to each point in a data set—that is, each data point $z_i$ is replaced with the transformed value $y_i = f(z_i)$, where $f$ is a function.
    • Data can also be transformed to make them easier to visualize.
    • Indicate why and how data transformation is performed and how this relates to ranked data.
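
A minimal sketch of the ranking transformation described above, in Python. The function name `rank_transform` and the sample values are hypothetical, and averaging the ranks of tied values is an assumed convention, since the text does not say how ties are handled.

```python
def rank_transform(values):
    """Replace each value by its 1-based rank in sorted order.

    Tied values share the average of the ranks they span (an assumed
    convention; the source text does not specify tie handling).
    """
    # Indices of the data points, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the run of equal values beginning at sorted position `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) hold ranks start+1..end+1.
        average_rank = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


data = [3.4, 5.1, 2.6, 7.3]   # hypothetical measurements
ranks = rank_transform(data)
print(ranks)                  # [2.0, 3.0, 1.0, 4.0]
```

Each rank depends on the whole sample rather than on the single value it replaces, so ranking is not a point-by-point function $y_i = f(z_i)$ with a fixed $f$.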
  • Exercises

    • If the arithmetic mean of log10 transformed data were 3, what would be the geometric mean?
    • Using Tukey's ladder of transformation, transform the following data using a λ of 0.5: 9, 16, 25
    • In the ADHD case study, transform the data in the placebo condition (D0) with λ values of 0.5, 0, -0.5, and -1.
    • How does the skew in each of these compare to the skew in the raw data?
    • Which transformation leads to the least skew?
  • Transforming data (special topic)

    • When data are very strongly skewed, we sometimes transform them so they are easier to model.
    • A transformation is a rescaling of the data using a function.
    • Transformed data are sometimes easier to work with when applying statistical models because the transformed data are much less skewed and outliers are usually less extreme.
    • While there is a positive association in each plot, the transformed data show a steadier trend, which is easier to model than the untransformed data.
    • (b) A scatterplot of the same data but where each variable has been log-transformed.
  • Log Transformations

    • State how a log transformation can help make a relationship clear
    • The log transformation can be used to make highly skewed distributions less skewed.
    • The comparison of the means of log-transformed data is actually a comparison of geometric means.
    • Therefore, if the arithmetic means of two sets of log-transformed data are equal then the geometric means are equal.
    • Scatter plots of brain weight as a function of body weight in terms of both raw data (upper panel) and log-transformed data (lower panel).
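
The link between means of log-transformed data and geometric means can be checked numerically. The sketch below uses hypothetical data and helper names: it back-transforms the arithmetic mean of the base-10 logs and compares the result with the geometric mean computed directly as the n-th root of the product.

```python
import math


def mean_of_logs(values):
    """Arithmetic mean of the log10-transformed values."""
    logs = [math.log10(v) for v in values]
    return sum(logs) / len(logs)


def geometric_mean(values):
    """n-th root of the product of the values."""
    return math.prod(values) ** (1 / len(values))


data = [1.0, 10.0, 100.0]              # hypothetical positive data
back_transformed = 10 ** mean_of_logs(data)

# The two quantities agree up to floating-point rounding.
print(back_transformed, geometric_mean(data))
```

Because raising 10 to a power is a strictly increasing operation, two data sets with equal means on the log scale must also have equal geometric means.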
  • Conclusion

    • We've described some of the basic "nuts and bolts" tools for entering and transforming network data.
    • The "bigger picture" is to think about network data (and any other, for that matter) as having "structure." Once you begin to see data in this way, you can begin to better imagine the creative possibilities: for example, treating actor-by-attribute data as actor-by-actor, or treating it as attribute-by-attribute.
    • Different research problems may call for quite different ways of looking at, and transforming, the same data structures.
  • Box-Cox Transformations

    • Data that are normal lead to a straight line on the q-q plot.
    • Such data are often strongly skewed, as is clear from Figure 3.
    • The kernel density plot of the optimally transformed data is shown in the left frame of Figure 4.
    • (L) Density plot of the 1973 British income data.
    • (L) Density plot of the 1973 British income data transformed with λ = 0.21.
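
A minimal sketch of a Box-Cox transformation, assuming the commonly used form $y = (x^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $y = \ln x$ for $\lambda = 0$; the excerpts above do not state the formula, so treat that form as an assumption. The income values are hypothetical and are not the 1973 British income data.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one positive value (assumed standard form)."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # Limiting case of (x**lam - 1) / lam as lam approaches 0.
        return math.log(x)
    return (x ** lam - 1) / lam


incomes = [500.0, 1200.0, 3000.0, 25000.0]   # hypothetical, right-skewed
transformed = [box_cox(x, 0.21) for x in incomes]
print(transformed)
```

The transform is increasing in x, so the ordering of the data is preserved while the long right tail is pulled in.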
  • Linear Transformations

    • Often it is necessary to transform data from one measurement scale to another.
    • To transform feet to inches, you simply multiply by 12.
    • Similarly, to transform inches to feet, you divide by 12.
    • The transformation consists of multiplying by a constant and then adding a second constant.
    • Such transformations are therefore called linear transformations.
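
As an illustration of multiplying by one constant and then adding a second, here is a short sketch. The function name `linear_transform` and the sample values are hypothetical; the Celsius-to-Fahrenheit case is added only as a familiar example with a nonzero additive constant.

```python
def linear_transform(x, slope, intercept=0.0):
    """Multiply by one constant, then add a second constant."""
    return slope * x + intercept


# Feet to inches: multiply by 12, add nothing.
feet = [1.0, 2.5, 6.0]
inches = [linear_transform(f, 12.0) for f in feet]
print(inches)   # [12.0, 30.0, 72.0]

# Inches back to feet: divide by 12, i.e. multiply by 1/12.
feet_again = [linear_transform(i, 1 / 12) for i in inches]

# Celsius to Fahrenheit: multiply by 9/5, then add 32.
boiling_f = linear_transform(100.0, 9 / 5, 32.0)
```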
  • The Discrete Fourier Transform

    • Suppose we have discrete data, not a continuous function.
    • This is the discrete version of the Fourier transform (DFT).
    • $f_n$ are the data and $c_k$ are the harmonic coefficients of a trigonometric function that interpolates the data.
    • In the handout you will see some Mathematica code for computing and displaying discrete Fourier transforms.
    • The reason is that Mathematica uses a special algorithm called the FFT (Fast Fourier Transform).
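
A minimal sketch of a discrete Fourier transform computed by direct summation. The normalization $c_k = \frac{1}{N}\sum_{n=0}^{N-1} f_n e^{-2\pi i k n / N}$ is an assumption, since the excerpts above do not give the formula and conventions differ between texts and software. This is not the FFT: the direct sum takes on the order of $N^2$ operations, whereas the FFT obtains the same coefficients in on the order of $N \log N$.

```python
import cmath


def dft(samples):
    """Direct-sum DFT with an assumed 1/N normalization."""
    n_samples = len(samples)
    coefficients = []
    for k in range(n_samples):
        total = 0j
        for n, f_n in enumerate(samples):
            total += f_n * cmath.exp(-2j * cmath.pi * k * n / n_samples)
        coefficients.append(total / n_samples)
    return coefficients


# One period of a sine wave sampled at four points (hypothetical data).
samples = [0.0, 1.0, 0.0, -1.0]
c = dft(samples)
print(c)
```

For this input only the k = 1 and k = 3 coefficients are nonzero (up to rounding), which reflects a single harmonic in the sampled data.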
  • Analyzing Data

    • Data analysis is an important step in the marketing research process, in which data are organized, reviewed, verified, and interpreted.
    • Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.
    • In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA).
    • All are varieties of data analysis.
    • Summarize the characteristics of data preparation and methodology of data analysis
  • Tukey Ladder of Powers

    • Plotting the data on a scatter diagram is the first step.
    • These data are plotted two ways in Figure 1.
    • The right frame displays the transformed data, together with the linear fit for the 1790-1960 period.
    • The demonstration in Figure 7 shows distributions of the data from the Stereograms case study as transformed with various values of λ.
    • Keep in mind that λ = 1 is the raw data.
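
A minimal sketch of a power transformation on Tukey's ladder. The convention used here is an assumption, since the excerpts above do not spell out the formula: $x^\lambda$ for $\lambda > 0$, $\log x$ for $\lambda = 0$, and $-(x^\lambda)$ for $\lambda < 0$, where the sign flip keeps larger raw values mapped to larger transformed values. The function name and data are hypothetical.

```python
import math


def tukey_ladder(x, lam):
    """Power transform of one positive value (assumed convention)."""
    if x <= 0:
        raise ValueError("this sketch assumes strictly positive data")
    if lam > 0:
        return x ** lam
    if lam == 0:
        return math.log(x)
    # Negative powers reverse the ordering, so negate to restore it.
    return -(x ** lam)


data = [4.0, 100.0]   # hypothetical values
for lam in (1, 0.5, 0, -0.5, -1):
    print(lam, [tukey_ladder(x, lam) for x in data])
```

With λ = 1 the function returns the raw values unchanged, matching the note above.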

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.