The primary method for creating new variables in Stata is the `generate` command. Load the `auto` dataset.

```
clear
sysuse auto
describe
```

## New Variable from Existing Variables

Let’s create a new variable that is the sum of `weight` and `length` (ignore for the moment that summing weights and lengths doesn’t make much sense). The syntax of `generate` is:

```
generate nameOfNewVariable = whateverTheNewVariableIsEqualTo
```

So to create a new variable called `weightlength` that is the sum of `weight` and `length`, we type:

```
generate weightlength = weight+length
```

Now we have a new variable called `weightlength`.

Suppose now that we want to create a new variable that is the square of weight.

```
generate weight2 = weight^2
```
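The right-hand side of `generate` can be any expression built from existing variables, operators, and functions. As a minimal sketch (the names `wtperlen` and `lnweight` are made up for this illustration), here are a ratio and a natural log:

```stata
sysuse auto, clear
* ratio of two existing variables
generate wtperlen = weight / length
* natural log, using the built-in function ln()
generate lnweight = ln(weight)
* quick look at the two new variables
summarize wtperlen lnweight
```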

## New Variable that is a Constant

Suppose we want to create a new variable that holds a constant value. (This isn’t necessarily a good idea, since you can store constants in macros, but a variable can be pretty convenient too.) Let’s make a new variable `x` that is equal to 100.

```
generate x = 100
```
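For comparison, here is a rough sketch of the macro alternative mentioned above, plus a scalar, which is another way to hold a single number. The names `k`, `shift`, `weightplus`, and `weightplus2` are arbitrary choices for this example. A local macro only exists while the lines that define and use it run together, so run this block in one go from a do-file.

```stata
sysuse auto, clear
* hold the constant in a local macro
local k = 100
* the macro is expanded to 100 before generate runs
generate weightplus = weight + `k'
* a scalar holds a single number without adding a column to the data
scalar shift = 100
generate weightplus2 = weight + scalar(shift)
```

Neither approach stores 74 copies of the same number in the dataset, which is the main argument for them over a constant variable.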

Let’s create a new variable that is equal to the mean of weight; we’ll call it `meanweight`. First run `summarize` to find the mean, then type that value into `generate`.

```
summarize weight
```

```
generate meanweight = 3019.459
```

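Rather than copying the number by hand, you can use the result that the `summarize` command stores in `r(mean)`. Because `meanweight` already exists, we store this version in `meanweight2`.

```
summarize weight
generate meanweight2 = r(mean)
```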
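A stored result like `r(mean)` can be used in any expression, not only on its own. The sketch below uses made-up names (`wbar`, `weightmean`, `weightdev`): it copies the mean into a scalar so that a later command cannot overwrite it, then builds each car's deviation from the mean.

```stata
sysuse auto, clear
* meanonly skips the printed table but still fills r(mean)
summarize weight, meanonly
* copy the stored result before another command replaces r()
scalar wbar = r(mean)
generate double weightmean = scalar(wbar)
generate double weightdev = weight - scalar(wbar)
* the deviations should average to approximately zero
summarize weightdev
```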

You can use the built-in variable `_N` to create a new variable that is equal to the number of observations in the dataset.

```
generate obs = _N
```

If you combine this with `by`, you can create a new variable equal to the number of observations within each level of the `by` variable. Because `obs` already exists, we give the new variable a different name, `obsforeign`. (`by` requires the data to be sorted by `foreign`; if they are not, use `bysort foreign:` instead.)

```
by foreign: generate obsforeign = _N
```

This creates a variable that is constant within the levels of `foreign`. That is, we get the number of foreign cars and the number of domestic cars: `obsforeign` equals 22 for every foreign car and 52 for every domestic car. Give it a try and see how it works.
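One way to check the group counts, sketched with the illustrative names `groupsize` and `groupshare`: inside `by`, `_N` is the size of the current group, while outside `by` it is the size of the whole dataset, so the two can be combined into a share.

```stata
sysuse auto, clear
* group size: 52 for domestic cars, 22 for foreign cars
bysort foreign: generate groupsize = _N
* outside of by, _N is the total number of observations (74)
generate groupshare = groupsize / _N
* one row per origin shows the constant within each group
tabulate foreign, summarize(groupsize)
```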

## New Variable that is a Random Draw from a Distribution

We can create a new variable that is a random draw from a distribution. Let’s create a new variable whose values are random draws from a normal distribution with a mean of 0 and a standard deviation of 1. The random normal generator is the function `rnormal()`. With no arguments it uses a mean of 0 and a standard deviation of 1, and `generate` draws one value for each observation in the dataset.

```
generate random = rnormal()
```
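Random draws change every time you run the command unless you set the seed first. A small sketch, where the seed value 12345 and the names `z` and `y` are arbitrary choices for this example:

```stata
sysuse auto, clear
* fix the seed so the same draws come back on every run
set seed 12345
* standard normal: mean 0, standard deviation 1
generate z = rnormal()
* rnormal(m, s) draws with mean m and standard deviation s
generate y = rnormal(50, 10)
summarize z y
```

With only 74 observations, the sample means and standard deviations will be close to, but not exactly, the values you asked for.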

## Create a New Variable that Indexes the Observations

You can use the built-in variable `_n`, which holds the number of the current observation, to create a variable that indexes the observations.

```
generate index = _n
```

This will create a new variable that runs from 1 to 74. You can combine `_n` with `by` to create an index within the levels of another variable. Because `index` already exists, we call this one `foreignindex`.

```
by foreign: generate foreignindex = _n
```

This will create a new variable that runs from 1 to 52 for domestic cars and 1 to 22 for foreign cars.
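The order of a within-group index depends on how the data are sorted inside each group. As a sketch (the names `pricerank` and `cheapest` are made up here), `bysort` with a second variable in parentheses sorts by that variable within each group without grouping on it, which turns the index into a rank:

```stata
sysuse auto, clear
* sort by foreign, then by price within foreign; group on foreign only
bysort foreign (price): generate pricerank = _n
* flag the lowest-priced car in each group
generate cheapest = (pricerank == 1)
list make foreign price if cheapest
```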

## Conclusion

I’ve just touched on the ways you can create new variables. You can also use the `egen` command to create new variables. Try new ways to create variables and be sure to read the help files.
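As a starting point for `egen`, here is a brief sketch that computes a mean in one step, both overall and within the levels of `foreign`. The variable names `meanweight_all` and `meanweight_origin` are invented for this example.

```stata
sysuse auto, clear
* overall mean of weight, without running summarize first
egen meanweight_all = mean(weight)
* mean of weight within each level of foreign
bysort foreign: egen meanweight_origin = mean(weight)
* one row per origin shows the group means
tabulate foreign, summarize(meanweight_origin)
```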