
Roger's Equations

Each week this blog features an equation, formula, or constant that occurs frequently in Engineering or Science. I will try to present the subject matter in an informal, conversational style that can be easily followed. Criticism and corrections are encouraged, as are suggestions for future discussions.

The Chemistry of DNA, Part 3

Posted July 07, 2009 12:00 AM by Roger Pink

This is the third part in a four part series on DNA. The earlier two parts can be found here: The Chemistry of DNA, Part 1 and Part 2.

Up until now we have only discussed the chemical structure of DNA (what it is). We have not yet discussed the chemical interactions of DNA (what it does). I think the best way to handle this is to start from the macroscopic and work our way back to DNA, since everyone is more familiar with the macroscopic. Therefore this post will briefly detail proteins and how they are used in the body. The fourth and final post will tie the proteins back to DNA.


The Stuff That We Are Made Of

The human body is made up of water, proteins, lipids (fats), carbohydrates, apatite (the mineral found in bones that gives them their hardness), DNA and RNA, dissolved inorganic ions (sodium, potassium, etc.), and dissolved gases (oxygen, etc.). These materials make up the muscle, fat, bone, organs, connective tissue (tendons, ligaments), blood, lymph, urine, etc. that are us. It should also be mentioned that there are a large number of microorganism symbionts found in and on humans as well.

Most of the above is mostly water, since most of the above is made of cells and cells are mostly water (anywhere from 60% to 90%). Muscle, besides water, is made mostly of the proteins actin, nebulin, myosin, and titin, along with some carbohydrates like glucose. Fat (adipose tissue), besides water, is made mostly of lipids and some proteins. Bone, besides water, is made mostly of apatite, calcium, and the protein collagen. What organs are made of depends on the organ; however most, besides being mostly water, are mostly proteins. Blood, besides being mostly water, is mostly hemoglobin and other proteins along with some glucose (a carbohydrate). And so on and so on. We could go on much longer, but let's stop here and look back.

I hope you have noticed two things from the two paragraphs above. First, that we are mostly water. Second, that proteins seem to be involved in everything in a major way. The truth is that most of what is important about life comes from proteins, so let's take a closer look at them.

I'm sure you've heard the term enzymes, right? They are biomolecules that catalyze chemical reactions in the body, sometimes making those reactions millions of times faster than they'd otherwise be. Almost all chemical reactions in all cells in all life need enzymes to occur at a useful rate. Guess what enzymes are... that's right, they're almost always proteins.

I'm sure you've heard of hormones too, right? Things like insulin, leptin, etc. Hormones alter the metabolism of certain cells, which is how the body regulates its cellular parts and coordinates between its different groups of cells. They are usually modified amino acids (the building blocks of proteins), proteins, or lipids. Mostly proteins. Here's a list giving some detail if you're interested.

That stuff we generically call "tissue", besides the water, is mostly proteins. Skin is mostly proteins. Bone is built on a framework of protein. Muscle is mostly proteins. The chemicals in our body are mostly proteins.

Besides water, we are mostly proteins.

The Spice of Life

So now that we know that proteins are very important to humans (and life in general), let's learn some things about them.

The first thing to know about proteins in the human body (and all life) is that there are a lot of them. How many, you ask? Well, so many that we don't know how many there are. Some estimates say tens of thousands of distinct types of proteins in the human body, others say hundreds of thousands. In fact, one of the fastest growing fields in biology today, called Proteomics, is simply the identification and characterization of proteins in the human body. The project dwarfs the Human Genome Project, with several databases, such as the Protein Data Bank and the Protein Information Resource, holding the structures of thousands of proteins.

The structures of proteins are as diverse as they are complicated. Proteins are polymer chains made up of amino acids (actually proteins are made up of peptides, which are made up of amino acids; we'll get to that later). Those chains can vary in length from a few amino acids to the 34,350 amino acids of Titin (C132983H211861N36149O40883S693). Yikes! They tend to look like tangled ropes. Here are a few examples:

The molecule on the left is Insulin and the molecule on the right is Hemoglobin, both proteins.

Amino Acids?

Alright, we're getting close now. We know there are a huge variety of proteins found in human beings (actually in all life). The question now is why? Where are all these proteins coming from?

To answer that question we need to understand what a Protein is. A protein is a polypeptide chain, or sometimes crosslinked polypeptide chains. A polypeptide chain is a polymer made out of something called an amino acid. There are 20 amino acids used to make the proteins found in us. They are:

Alanine, Arginine, Asparagine, Aspartic acid, Cysteine, Glutamic acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine.

Remember, that's tens of thousands or perhaps hundreds of thousands of distinct proteins that are all built by putting the above 20 amino acids together in different combinations. To give you an idea of what these amino acids look like, here are some amino acid structures, shown first as a chemical symbol diagram and then as a ball and stick diagram (so you can get an idea of the 3D shape).

Alanine

Histidine

In general amino acids have the following chemical structure:

On the left side of the diagram above, the nitrogen attached to two hydrogens is called the amino group. On the right side of the diagram above, the carbon, along with the two oxygens attached to it and the hydrogen attached to one of those oxygens, is called the carboxyl group. Carboxyl groups are acidic in nature, thus the term "Amino (Amino group) Acid (Carboxyl group)". The R in the diagram above, which stands for Radical, is a placeholder for what can be an atom or a group of atoms, and it's what makes amino acids different from one another. For instance, take another look at the amino acid Alanine:

You can see that the radical (R) in the amino acid alanine above is CH3 (Methyl).

Now let's look at the amino acid Histidine again:

You can see that the radical (R) above in the amino acid Histidine contains an imidazole ring (imidazole on its own is C3H4N2), attached to the backbone through a CH2 group.

Here's one more amino acid, Tryptophan:

You can see that the radical (R) above in the amino acid Tryptophan contains an indole ring system (indole on its own is C8H7N), attached to the backbone through a CH2 group*.

So those are the Amino Acids. Now, as I said earlier, Proteins are made of peptides, which themselves are polymer chains of amino acids (keep in mind that a protein can be made of just one peptide as well). If you remember, amino acids have an amino group and a carboxyl group. The amino group of one amino acid can bond to the carboxyl group of another, resulting in a continuous chain of amino acids. To see how this bonding occurs, look at the diagram below:

Once the peptide bond is formed, the amino acids are linked, and a dipeptide has formed (di meaning two). Notice that the dipeptide above still has a carboxyl group on one end and an amino group on the other. Clearly you can add more amino acids, and thus polypeptides are made (poly meaning many). See for instance the polypeptide below:

These polypeptides can be proteins themselves, or sometimes it's several polypeptides crosslinked together that make the protein, as in the protein Collagen:

In the diagram above, a Collagen protein is displayed on the left and right. The right side shows that the protein itself consists of three crosslinked polypeptide chains (each a different color).

The point is that amino acids strung together into polymers called polypeptides are the building blocks of proteins. Since there are 20 different types of amino acids, even a dipeptide (two amino acids linked) molecule has 20^2 = 400 possible structures! A tripeptide (three amino acids linked) molecule has 20^3 = 8000 possible structures! Remember the protein Titin I told you about? That had 34,350 amino acids! The fact that the body has perhaps hundreds of thousands of proteins isn't remarkable because it's a lot of proteins. In a way, given all the possible combinations of the available 20 amino acids, it's remarkable that there are so few. Of course, the body doesn't just make so many proteins for variety's sake. They have functions (at least we think they all do; most probably do).
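To make the counting argument concrete, here is a minimal Python sketch. It assumes every position in the chain can hold any of the 20 amino acids and that order matters, which is the same simplification used above. The names `AMINO_ACID_TYPES` and `sequence_count` are made up for this illustration.

```python
import math

# Assumption: each position in a chain can hold any of the 20 standard
# amino acids, and two chains differ if their order differs.
AMINO_ACID_TYPES = 20


def sequence_count(length: int) -> int:
    """Number of possible chains of the given length: 20 ** length."""
    return AMINO_ACID_TYPES ** length


for n in (2, 3, 10):
    print(f"{n} amino acids: {sequence_count(n):,} possible sequences")

# A Titin-sized count is far too large to print usefully, so report
# how many decimal digits it has, using digits = floor(n * log10(20)) + 1.
titin_length = 34_350
digits = math.floor(titin_length * math.log10(AMINO_ACID_TYPES)) + 1
print(f"{titin_length:,} amino acids: a count with {digits:,} digits")
```

The digit count is computed with logarithms rather than by building the full number, simply because a number tens of thousands of digits long is awkward to handle and adds nothing to the point.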

The ordering of the amino acids in a protein is what ultimately determines its function, as the ordering determines the protein's shape and chemistry. The variety of properties a protein can have is almost overwhelming, mainly because of the diversity offered by its constituent amino acid building blocks.

In my next and final post on the chemistry of DNA, we will be getting back to DNA and explaining how it relates to the stuff regarding proteins we just discussed. However, if you are interested in learning more about Proteins, you may consider the following videos. If you're really ambitious you can try the last one (it gets into the thick of it).

Here are some excellent followup videos regarding proteins. (Short Summary Video on Protein Structure, Protein Folding Video, Detailed Lecture on Protein Formation (Long))
Special thanks to Wikipedia, my favorite website on the web.

*I originally labeled this functional group incorrectly. A big thanks to Jmueller for catching the error and letting me know.


The Chemistry of DNA, Part 2

Posted February 11, 2009 5:14 PM by Roger Pink

Last time I discussed the chemical structure of DNA including the Phosphate-Sugar Backbone and the Nucleotides (Adenine, Thymine, Guanine, Cytosine). But you have all heard about DNA before, and even if you didn't know the particular names of the parts, you already knew that DNA is a double helix of complementary base pairs (complementary nucleotides). So we all know what DNA is.

But how does DNA work? How does something so small determine what we look like? How fast we can run? Our blood type? I mean aside from the nebulous arm waving explanation that it "passes on genetic information".

To understand that, we need to further investigate the structure of DNA. Last time we saw that DNA is a double helix polymer with hydrogen bonded base pairs. But how many base pairs? How long is human DNA?

Human DNA consists of about 3.17 billion base pairs. That's not one continuous polymer of DNA, rather it is broken up into 23 paired structures called chromosomes. So what is a chromosome?

A chromosome is an organized structure of DNA and proteins. Please take a look at the picture below. As you go from left to right, the scale becomes larger and larger, with DNA, farthest to the left, being the key building block (along with proteins) of the chromosomes.

As you can see from above, it takes a lot of DNA to make a chromosome. The chromosomes are found in the cell nucleus. As mentioned earlier, there are 23 pairs of chromosomes in the nucleus of a cell. Here they are. The last pair are the sex chromosomes.

Every cell in the human body has these 46 chromosomes, except for sex cells which have only 23 chromosomes (unpaired). The average human body has approximately 50 trillion cells. Go ahead, let that sink in. I'll even repeat it:

3.17 billion unique base pairs of DNA per cell
50 trillion cells in the human body
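Here is a rough Python sketch of what those two numbers imply for the total length of DNA in one body. The base pair and cell counts are the ones quoted above. The spacing of about 0.34 nanometers per base pair is an assumption added here (the usual textbook figure for the common B form of DNA), not a number from this post.

```python
# Back-of-the-envelope estimate of the total DNA length in one human body.
BASE_PAIRS_PER_CELL = 3.17e9      # figure quoted in the post
CELLS_PER_BODY = 50e12            # figure quoted in the post
METERS_PER_BASE_PAIR = 0.34e-9    # assumption: textbook rise per base pair
METERS_PER_MILE = 1609.344

length_per_cell_m = BASE_PAIRS_PER_CELL * METERS_PER_BASE_PAIR
total_length_m = length_per_cell_m * CELLS_PER_BODY
total_length_miles = total_length_m / METERS_PER_MILE

print(f"DNA per cell: about {length_per_cell_m:.2f} m")
print(f"DNA per body: about {total_length_miles:.1e} miles")
```

Under these assumptions each cell carries roughly a meter of DNA, and the whole-body total comes out in the tens of billions of miles.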

So for each human walking around, that's over 2 billion miles of DNA walking around with them. Of course, there are many other animals, bacteria, plants, etc. out there. All have DNA. Let's see how long their DNA is. Please note that the unit Mb means millions of base pairs:

Oryza sativa (rice) genome = 441 Mb
Musa sp. (banana) genome = 873 Mb
Spinacia oleracea (spinach) genome = 989 Mb
Gallus gallus (chicken) genome = 1,200 Mb
Zea mays (corn) genome = 2,500 Mb
Homo sapiens (human) genome = 3,000 Mb
Nicotiana tabacum (tobacco) genome = 4,434 Mb
Vanilla planifolia (vanilla) genome = 7,672 Mb
Avena sativa (oat) genome = 11,315 Mb
Triticum aestivum (wheat) genome = 15,966 Mb
Triturus cristatus (crested newt) genome = 18,600 Mb
Necturus maculosus (mudpuppy) genome = 50,000 Mb
Lilium longiflorum (Easter lily) genome = 90,000 Mb
Fritillaria assyriaca (fritillary, a flowering plant) genome = 124,900 Mb
Protopterus aethiopicus (lungfish) genome = 139,000 Mb
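To put the list in perspective, here is a small Python sketch that expresses a few of these genomes as multiples of the human genome. The sizes are copied from the list above. The dictionary name and the choice of which species to include are arbitrary.

```python
# Genome sizes in Mb (millions of base pairs), copied from the list above.
genome_sizes_mb = {
    "rice": 441,
    "chicken": 1_200,
    "human": 3_000,
    "wheat": 15_966,
    "mudpuppy": 50_000,
    "Easter lily": 90_000,
    "lungfish": 139_000,
}

human = genome_sizes_mb["human"]

# Print the genomes from smallest to largest with their size relative to ours.
for name, size in sorted(genome_sizes_mb.items(), key=lambda item: item[1]):
    print(f"{name:12s} {size:>8,} Mb  {size / human:6.2f} x human")
```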

That's right, the lungfish genome is more than 40 times larger than the human genome. In fact, there are lots of animals and plants and even protozoa with larger DNA than us (C-Value is just a measure of DNA size):

Kind of makes you rethink your understanding of evolution, doesn't it? Well, don't panic, evolution is safe, it's just a bit more complicated than you were led to believe. Let's take a deep breath and go back to the beginning to see if we can't make sense of what we are seeing.

Let's zoom back in on the chromosome again, way down to the DNA scale. See the picture below:

We've all heard the term "gene". It's what DNA is supposed to pass on. But what is a gene? And more importantly, how does it pass on traits to us? All that and more in my next blog entry.

Special thanks to the following websites:
http://en.wikipedia.org/wiki/Ploidy
http://en.wikipedia.org/wiki/Nucleosomes
http://en.wikipedia.org/wiki/Gene
http://dwb4.unl.edu/Chem/CHEM869N/CHEM869NLinks/chemistry.about.com/science/chemistry/library/weekly/aa061598a.htm
http://sandwalk.blogspot.com/2008/02/theme-genomes-junk-dna.html


The Chemistry of DNA

Posted August 25, 2008 10:01 AM by Roger Pink

Deoxyribonucleic Acid (DNA) consists of two polymers with backbones consisting of sugars and phosphate groups. These two polymers are connected to each other through hydrogen bonds between nucleobases attached to the sugar group on the backbone. Please see the diagram below:

There are four types of nucleobases found in DNA: Cytosine, Guanine, Adenine, and Thymine. The four bases are complementary, meaning that when DNA is formed and the nucleobases hydrogen bond, Cytosine can only pair with Guanine and Adenine can only pair with Thymine. The diagram above is a bit vague, so let's view the molecular components of DNA one by one.

First the sugar group from the DNA backbone is Deoxyribose. Deoxyribose (actually 2-Deoxyribose) is basically a Ribose minus an oxygen (thus Deoxy). Technically it replaces the hydroxyl group (OH) with a hydrogen at the second carbon (thus 2-Deoxy). Please see image below:

The Phosphate group (see below) connects to the sugar group to form a polymer.

Finally there are the 4 nucleobases which connect to the sugar group and "stick out" so that they can hydrogen bond:

Put them together and you form single stranded DNA:

Now if you take two single strands of DNA and line them up so that complementary base pairs hydrogen bond, you get double stranded DNA:

You can see above that the reason DNA nucleobases are complementary has to do with the hydrogen bonding, like two puzzle pieces that fit together.

DNA is so large that pieces of it can be chemically interesting. One often hears about DNA sequences. A DNA sequence is the order in which the nucleobases are found in the DNA. Next time we will discuss DNA sequencing and the chemical origins of inherited traits.

http://www.albany.edu/~rp858838/


Phi - The Golden Ratio Part II

Posted February 29, 2008 12:00 AM by Roger Pink

Last entry I derived Φ (pronounced "Fi" by some and "Fee" by others) by cutting a line in such a way that the ratio of the larger new segment to the smaller one is the same as the ratio of the original line to the larger segment. I showed that when this constraint is imposed, there are two possible analytical solutions,

Φ=1.61803398874989.....
Φ= - 0.618033988749894....

I indicated that the first value is traditionally taken to be Φ. The second value, although a perfectly reasonable analytical solution, is negative, which would imply a negative length for one of the line segments involved (remember, Phi represents a ratio of lengths), and that is not allowed.

So

Φ=1.61803398874989.....

Now what?

Properties of Phi

Recurrence Relation

The inverse of Φ is Φ - 1. (1/Φ = Φ - 1)

The square of Φ is Φ + 1. (Φ^2 = Φ + 1)

These two expressions are actually two examples of a more general property of Φ,

Φ^n = Φ^(n-1) + Φ^(n-2)

Notice that if n=1,

Φ = Φ^0 + Φ^(-1) = 1 + 1/Φ, which can be rewritten as the first equation above,
1/Φ = Φ - 1

and if n=2,

Φ^2 = Φ^1 + Φ^0 = Φ + 1, which is the second equation above.
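The general property behind these two cases is Φ^n = Φ^(n-1) + Φ^(n-2). A quick numerical check, sketched in Python, shows it holding for several values of n, including negative ones. This is only a floating point spot check, not a proof, and the range of n is an arbitrary choice.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the positive root, 1.618...

# Check PHI**n == PHI**(n-1) + PHI**(n-2) for a handful of integers n.
for n in range(-3, 6):
    left = PHI ** n
    right = PHI ** (n - 1) + PHI ** (n - 2)
    matches = math.isclose(left, right)
    print(f"n = {n:2d}: {left:.6f} vs {right:.6f}  match: {matches}")
```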

Continued Fractions

Φ can be expressed as the continued fraction:

Φ = 1 + 1/(1 + 1/(1 + 1/(1 + ...)))

or, in compact notation,

Φ = [1; 1, 1, 1, ...]
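The continued fraction for Φ is 1 + 1/(1 + 1/(1 + ...)), which follows from substituting Φ = 1 + 1/Φ into itself over and over. Here is a minimal Python sketch that cuts the fraction off after a chosen number of levels and compares the result with Φ. The function name `continued_fraction` and the depths tried are just illustrative choices.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def continued_fraction(depth: int) -> float:
    """Evaluate 1 + 1/(1 + 1/(1 + ...)) truncated after `depth` levels."""
    value = 1.0
    for _ in range(depth):
        value = 1.0 + 1.0 / value
    return value


for depth in (1, 2, 5, 10, 20, 40):
    approx = continued_fraction(depth)
    print(f"depth {depth:2d}: {approx:.12f}  error {abs(approx - PHI):.1e}")
```

Each extra level shrinks the error, so a few dozen levels already agree with Φ to about the limit of floating point precision.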


Fibonacci Series

The ratio of successive terms in the Fibonacci Series approaches Φ. The Fibonacci Series is:

F = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, ...

and can be expressed in terms of Φ (taking F(0) = 0 and F(1) = 1):

F(n) = (Φ^n - (-1/Φ)^n)/√5

Noting that, as I said earlier, the ratio of successive terms of this series approaches Φ:

F(n+1)/F(n) → Φ as n → ∞

so 2/1=2, 3/2=1.5, 5/3=1.667, 8/5=1.6, 13/8=1.625, 21/13=1.615, 34/21=1.619, etc.
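Here is a short Python sketch of both claims. It builds the series by addition, compares it against the standard closed form F(n) = (Φ^n - (-1/Φ)^n)/√5 (usually called Binet's formula, and assumed here to be the expression in terms of Φ that is meant), and then prints the ratios of successive terms. The function names are made up for the example.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fibonacci(count: int) -> list[int]:
    """First `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def binet(n: int) -> int:
    """Closed form (PHI**n - (-1/PHI)**n) / sqrt(5), rounded to an integer."""
    return round((PHI ** n - (-1 / PHI) ** n) / math.sqrt(5))


terms = fibonacci(17)  # 0 through 987, the terms listed above
closed_form_agrees = terms == [binet(n) for n in range(17)]
print("closed form matches the series:", closed_form_agrees)

# Ratios of successive terms, skipping the leading 0 to avoid dividing by it.
for smaller, larger in zip(terms[1:], terms[2:]):
    print(f"{larger}/{smaller} = {larger / smaller:.6f}")
```

The rounding in `binet` is only there to absorb floating point error. For large n the floating point version eventually stops being exact, so this is a sketch for small terms only.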

It's possible to calculate powers of Φ with Fibonacci Series terms.

Φ^n = F(n-1) + F(n)Φ

where F(n) is the nth term of the Fibonacci series, counting from F(0) = 0. For example:

Φ^5 = 3 + 5Φ
or
Φ^12 = 89 + 144Φ
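The two examples can be checked numerically with a few lines of Python. This sketch assumes the series is indexed from F(0) = 0, which is the indexing that makes both examples come out right. The helper name `fib` is arbitrary.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fib(n: int) -> int:
    """nth Fibonacci number with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Compare PHI**n computed directly with F(n-1) + F(n)*PHI.
for n in (2, 5, 12):
    direct = PHI ** n
    via_fibonacci = fib(n - 1) + fib(n) * PHI
    print(f"n = {n:2d}: {fib(n - 1)} + {fib(n)}*PHI = {via_fibonacci:.6f}"
          f"  (direct power: {direct:.6f})")
```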


Imaginary Numbers

A neat expression that involves Φ and i is,

Sin(i lnΦ)= (1/2) i
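This one is easy to check with Python's complex math module. The reason it works is that sin(ix) = i sinh(x), and sinh(ln Φ) = (Φ - 1/Φ)/2 = 1/2 because 1/Φ = Φ - 1. The sketch below only confirms the number to floating point precision.

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2

# Evaluate sin(i * ln(PHI)) using complex arithmetic.
value = cmath.sin(1j * math.log(PHI))
print("sin(i ln PHI) =", value)

# The same quantity via sinh: sin(i x) = i * sinh(x).
print("sinh(ln PHI)  =", math.sinh(math.log(PHI)))
```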

That's it for this entry. If there is a topic you are interested in feel free to email me and I'll try to get to it.

Thanks to the following sources:

http://en.wikipedia.org/wiki/Golden_ratio
http://goldennumber.net/five(5).htm
http://mathworld.wolfram.com/GoldenRatio.html

http://www.albany.edu/~rp858838


Phi - The Golden Ratio Part I

Posted January 10, 2008 12:00 AM by Roger Pink

Derivation of Phi

Phi, in the words of Euclid:

"A straight line is said to have been cut in extreme and mean ratio (ratio of Phi) when, as the whole line is to the greater segment, so is the greater to the less"

Hopefully a picture will help us visualize what Euclid is saying.

Basically Euclid is saying that if you cut the line segment A above at the marked point, line segments B and C are created. For the special case where the segments are related such that A/B = B/C, that ratio is Φ (Phi). So let's solve the problem algebraically to find the value of Φ.

A/B = B/C

noting that A=B+C (see diagram above) and substituting we get

(B+C)/B = B/C

which can be rewritten as

B/B + C/B = B/C

1 + C/B = B/C

for simplicity, let's call the ratio B/C = Φ, which gives us

1 + 1/Φ =Φ

solving,

1/Φ = Φ - 1

1 = Φ^2 - Φ

0 = Φ^2 - Φ - 1

which gives roots

Φ = (1 ± √5)/2 (I used the quadratic formula to solve this with a=1, b=-1, c=-1)

The positive root is traditionally taken to be the value of Phi

Φ=1.61803398874989.....

though the negative root is just as valid a solution

Φ= - 0.618033988749894....
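For anyone who wants to reproduce the two roots, here is a minimal Python sketch of the quadratic formula applied to Φ^2 - Φ - 1 = 0 with a=1, b=-1, c=-1. The function name `quadratic_roots` is made up for the example, and it assumes the discriminant is not negative, which is true here since b^2 - 4ac = 5.

```python
import math


def quadratic_roots(a: float, b: float, c: float) -> tuple[float, float]:
    """Real roots of a*x**2 + b*x + c = 0 (assumes a non-negative discriminant)."""
    discriminant = b * b - 4 * a * c
    root = math.sqrt(discriminant)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


phi_positive, phi_negative = quadratic_roots(1, -1, -1)
print("positive root:", phi_positive)
print("negative root:", phi_negative)

# Both roots should satisfy the starting equation 1 + 1/x = x.
for x in (phi_positive, phi_negative):
    print(f"1 + 1/x - x = {1 + 1 / x - x:.1e}")
```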

Here is a link that gives phi to 50,000 places

http://www.cs.arizona.edu/icon/oddsends/phi.htm
