
Roger's Equations

Each week this blog features an equation, formula, or constant that occurs frequently in engineering or science. I will try to present the subject matter in an informal, conversational style that is easy to follow. Criticism and corrections are encouraged, as are suggestions for future discussions.

The Chemistry of DNA, Part 2

Posted February 11, 2009 5:14 PM by Roger Pink

Last time I discussed the chemical structure of DNA, including the phosphate-sugar backbone and the nucleobases (Adenine, Thymine, Guanine, Cytosine). But you have all heard about DNA before, and even if you didn't know the particular names of the parts, you already knew that DNA is a double helix of complementary base pairs (pairs of complementary nucleobases). So we all know what DNA is.

But how does DNA work? How does something so small determine what we look like? How fast we can run? Our blood type? I mean aside from the nebulous arm waving explanation that it "passes on genetic information".

To understand that, we need to further investigate the structure of DNA. Last time we saw that DNA is a double helix polymer with hydrogen bonded base pairs. But how many base pairs? How long is human DNA?

Human DNA consists of about 3.17 billion base pairs. That's not one continuous polymer of DNA; rather, it is broken up into 23 pairs of structures called chromosomes. So what is a chromosome?

A chromosome is an organized structure of DNA and proteins. Please take a look at the picture below. As you go from left to right, the scale becomes larger and larger, with DNA, farthest to the left, being the key building block (along with proteins) of the chromosomes.

As you can see from above, it takes a lot of DNA to make a chromosome. The chromosomes are found in the cell nucleus. As mentioned earlier, there are 23 pairs of chromosomes in the nucleus of a cell. Here they are. The last pair are the sex chromosomes.

Nearly every cell in the human body has these 46 chromosomes; the main exception is sex cells, which have only 23 chromosomes (unpaired). The average human body has approximately 50 trillion cells. Go ahead, let that sink in. I'll even repeat it:

3.17 billion unique base pairs of DNA per cell
50 trillion cells in the human body

So for each human walking around, that's over 2 billion miles of DNA walking around with them. Of course, there are many other animals, bacteria, plants, etc. out there, and all of them have DNA. Let's see how long their DNA is. Note that the unit Mb means millions of base pairs:
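
The mileage figure can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the base-pair and cell counts are the ones quoted above, but the spacing of about 0.34 nm per base pair is a textbook value for B-form DNA that I am assuming here, since the post does not say what spacing was used.

```python
# Rough estimate of the total length of DNA in one human body.
# BASE_PAIRS_PER_CELL and CELLS_PER_BODY are the figures quoted in the post.
# NM_PER_BASE_PAIR is an assumption (textbook rise per base pair in B-DNA).

BASE_PAIRS_PER_CELL = 3.17e9   # from the post
CELLS_PER_BODY = 50e12         # from the post
NM_PER_BASE_PAIR = 0.34        # assumed, not stated in the post
METERS_PER_MILE = 1609.344


def dna_length_miles(base_pairs, cells, nm_per_bp=NM_PER_BASE_PAIR):
    """End-to-end length, in miles, of `cells` copies of a DNA
    molecule that is `base_pairs` long."""
    meters_per_cell = base_pairs * nm_per_bp * 1e-9
    return meters_per_cell * cells / METERS_PER_MILE


per_cell_m = BASE_PAIRS_PER_CELL * NM_PER_BASE_PAIR * 1e-9
total_miles = dna_length_miles(BASE_PAIRS_PER_CELL, CELLS_PER_BODY)

print(f"DNA per cell: about {per_cell_m:.2f} m")
print(f"DNA per body: about {total_miles:.2e} miles")
```

Under these assumptions each cell holds roughly a meter of DNA and the whole body comes out in the tens of billions of miles, so "over 2 billion miles" is a comfortable lower bound. Counting both copies of each chromosome in a cell would roughly double the total.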

Oryza sativa (rice) genome = 441 Mb
Musa sp. (banana) genome = 873 Mb
Spinacia oleracea (spinach) genome = 989 Mb
Gallus gallus (chicken) genome = 1,200 Mb
Zea mays (corn) genome = 2,500 Mb
Homo sapiens (human) genome = 3,000 Mb
Nicotiana tabacum (tobacco) genome = 4,434 Mb
Vanilla planifolia (vanilla) genome = 7,672 Mb
Avena sativa (oat) genome = 11,315 Mb
Triticum aestivum (wheat) genome = 15,966 Mb
Triturus cristatus (crested newt) genome = 18,600 Mb
Necturus maculosus (mudpuppy) genome = 50,000 Mb
Lilium longiflorum (Easter lily) genome = 90,000 Mb
Fritillaria assyriaca (fritillary, a flowering plant) genome = 124,900 Mb
Protopterus aethiopicus (lungfish) genome = 139,000 Mb
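
To put the list above in perspective, here is a small sketch that expresses each genome size as a multiple of the human one. The numbers are copied straight from the list; the dictionary and function names are just my own labels.

```python
# Genome sizes in Mb (millions of base pairs), copied from the list above.
GENOME_MB = {
    "Oryza sativa": 441,
    "Musa sp.": 873,
    "Spinacia oleracea": 989,
    "Gallus gallus": 1_200,
    "Zea mays": 2_500,
    "Homo sapiens": 3_000,
    "Nicotiana tabacum": 4_434,
    "Vanilla planifolia": 7_672,
    "Avena sativa": 11_315,
    "Triticum aestivum": 15_966,
    "Triturus cristatus": 18_600,
    "Necturus maculosus": 50_000,
    "Lilium longiflorum": 90_000,
    "Fritillaria assyriaca": 124_900,
    "Protopterus aethiopicus": 139_000,
}
HUMAN_MB = GENOME_MB["Homo sapiens"]


def relative_to_human(size_mb):
    """Genome size as a multiple of the human genome size."""
    return size_mb / HUMAN_MB


# Print the list from smallest to largest genome.
for name, size in sorted(GENOME_MB.items(), key=lambda item: item[1]):
    ratio = relative_to_human(size)
    print(f"{name:24s} {size:>8,d} Mb  {ratio:7.2f}x human")
```

Nine of the fifteen entries come out larger than the human genome.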

That's right, the lungfish genome is more than 40 times larger than the human genome. In fact, there are lots of animals and plants and even protozoa with larger genomes than ours (C-value is just a measure of genome size):

Kind of makes you rethink your understanding of evolution, doesn't it? Well, don't panic, evolution is safe, it's just a bit more complicated than you were led to believe. Let's take a deep breath and go back to the beginning to see if we can't make sense of what we are seeing.

Let's zoom back in on the chromosome again, way down to the DNA scale. See the picture below:

We've all heard the term "gene". It's what DNA is supposed to pass on. But what is a gene? And more importantly, how does it pass on traits to us? All that and more in my next blog entry.

Special thanks to the following websites:
http://en.wikipedia.org/wiki/Ploidy
http://en.wikipedia.org/wiki/Nucleosomes
http://en.wikipedia.org/wiki/Gene
http://dwb4.unl.edu/Chem/CHEM869N/CHEM869NLinks/chemistry.about.com/science/chemistry/library/weekly/aa061598a.htm
http://sandwalk.blogspot.com/2008/02/theme-genomes-junk-dna.html


The Chemistry of DNA

Posted August 25, 2008 10:01 AM by Roger Pink

Deoxyribonucleic Acid (DNA) consists of two polymers with backbones consisting of sugars and phosphate groups. These two polymers are connected to each other through hydrogen bonds between nucleobases attached to the sugar group on the backbone. Please see the diagram below:

There are four types of nucleobases found in DNA: Cytosine, Guanine, Adenine, and Thymine. The four bases are complementary, meaning that when DNA is formed and the nucleobases hydrogen bond, Cytosine can only pair with Guanine and Adenine can only pair with Thymine. The diagram above is a bit vague, so let's view the molecular components of DNA one by one.

First, the sugar group in the DNA backbone is Deoxyribose. Deoxyribose (actually 2-Deoxyribose) is basically a Ribose minus an oxygen (thus Deoxy). Technically, the hydroxyl group (OH) at the second carbon is replaced with a hydrogen (thus 2-Deoxy). Please see image below:

The Phosphate group (see below) connects to the sugar group to form a polymer.

Finally there are the 4 nucleobases which connect to the sugar group and "stick out" so that they can hydrogen bond:

Put them together and you form single stranded DNA:

Now if you take two strands of single stranded DNA and line them up with complementary base pairs hydrogen bonding, you get double stranded DNA.

You can see above that the reason that DNA nucleobases are complementary has to do with the hydrogen bonding, like two puzzle pieces that fit together.

DNA is so large that pieces of it can be chemically interesting. One often hears about DNA sequences. These are the order in which the nucleobases are found in the DNA. Next time we will discuss DNA sequencing and the chemical origins of inherited traits.

http://www.albany.edu/~rp858838/


Phi - The Golden Ratio Part II

Posted February 29, 2008 12:00 AM by Roger Pink

Last entry I derived Φ (pronounced "Fi" by some and "Fee" by others) by cutting a line in such a way that the ratio of the larger new line segment to the smaller one was the same as the ratio of the original line to the larger segment. I showed that when the constraints that define Φ are imposed, there are two possible analytical solutions,

Φ=1.61803398874989.....
Φ= - 0.618033988749894....

I indicated that the first value is traditionally taken to be Φ. The second value, although a perfectly reasonable analytical solution, is negative, which would imply a negative length for one of the line segments involved (remember, Phi represents a ratio of lengths), and that is not allowed.

So

Φ=1.61803398874989.....

Now what?

Properties of Phi

Recurrence Relation

The inverse of Φ is Φ - 1. (1/Φ = Φ - 1)

The square of Φ is Φ + 1. (Φ^2 = Φ + 1)

These two expressions are actually two examples of a more general property of Φ,

Φ^n = Φ^{n-1} + Φ^{n-2}

notice if n=1

Φ = Φ^0 + Φ^{-1} = 1 + 1/Φ, which can be rewritten as the first equation above,
1/Φ = Φ - 1

notice if n=2

Φ^2 = Φ^1 + Φ^0 = Φ + 1, which is the second equation above.
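
The n=1 and n=2 cases are both instances of the recurrence Φ^n = Φ^{n-1} + Φ^{n-2}. Here is a quick numerical check of that recurrence for a range of n, including negative exponents. It is only a floating-point sanity check, not a proof, and the names PHI and recurrence_gap are my own.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # positive root of x^2 - x - 1 = 0


def recurrence_gap(n):
    """PHI**n minus (PHI**(n-1) + PHI**(n-2)); should be ~0 for any integer n."""
    return PHI**n - (PHI**(n - 1) + PHI**(n - 2))


for n in range(-3, 8):
    print(f"n = {n:2d}: gap = {recurrence_gap(n):.1e}")
```

Every gap should be zero up to floating-point rounding.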

Continued Fractions

Φ can be expressed as a continued fraction, either in compact notation

Φ = [1; 1, 1, 1, ...]

or written out in full

Φ = 1 + 1/(1 + 1/(1 + 1/(1 + ...)))
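
The continued fraction for Φ, 1 + 1/(1 + 1/(1 + ...)), can be evaluated by cutting it off at some depth and working from the inside out. The following is a minimal sketch of that idea; the function name and the chosen depths are just for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def continued_fraction(depth):
    """Evaluate 1 + 1/(1 + 1/(1 + ...)) truncated after `depth` nested fractions."""
    value = 1.0
    for _ in range(depth):
        value = 1.0 + 1.0 / value
    return value


for depth in (1, 2, 5, 10, 20, 40):
    approx = continued_fraction(depth)
    print(f"depth {depth:2d}: {approx:.12f}  error {abs(approx - PHI):.2e}")
```

Each extra level is one more application of Φ = 1 + 1/Φ, which is why the truncated values home in on Φ.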


Fibonacci Series

The ratio of successive terms in the Fibonacci Series approaches Φ. The Fibonacci Series is:

F=0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, ...

and can be expressed in terms of Φ (with F(0) = 0 as the first term):

F(n) = (Φ^n - (-Φ)^{-n}) / √5

Note that, as I said earlier, the ratio of successive terms of this series approaches Φ:

F(n+1)/F(n) → Φ as n → ∞

so 2/1=2, 3/2=1.5, 5/3=1.6667, 8/5=1.6, 13/8=1.625, 21/13=1.615, 34/21=1.619, etc.
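
Here is a short sketch that generates the series, prints the successive ratios, and checks each term against the standard closed form in terms of Φ (usually called Binet's formula), F(n) = (Φ^n - (-Φ)^{-n})/√5. I am assuming that this is the expression meant above; the function names are my own.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def binet(n):
    """Closed form F(n) = (PHI**n - (-PHI)**(-n)) / sqrt(5), rounded to an int.
    Floating point keeps this exact only for moderate n."""
    return round((PHI**n - (-PHI) ** (-n)) / math.sqrt(5))


terms = fibonacci(17)

# Ratios of successive terms, starting at 2/1 as in the text.
for prev, cur in zip(terms[2:], terms[3:]):
    print(f"{cur}/{prev} = {cur / prev:.6f}")

# Compare the generated terms with the closed form.
print(all(binet(n) == f for n, f in enumerate(terms)))
```

The ratios alternate above and below Φ while closing in on it.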

It's possible to calculate powers of Φ with Fibonacci Series terms.

Φ^n = F(n-1) + F(n)Φ

where F(n) is the nth term of the Fibonacci series, counting from F(0) = 0. For example:

Φ^5 = 3 + 5Φ
or
Φ^12 = 89 + 144Φ
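
One way to see where these coefficients come from is to multiply by Φ repeatedly and use Φ^2 = Φ + 1 to get rid of the square each time. The sketch below does exactly that with integers only; phi_power_coefficients is a made-up name for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def phi_power_coefficients(n):
    """Return integers (a, b) such that PHI**n == a + b*PHI, for n >= 0.

    Start from PHI**0 = 1 + 0*PHI and multiply by PHI n times:
    (a + b*PHI) * PHI = a*PHI + b*(PHI + 1) = b + (a + b)*PHI.
    """
    a, b = 1, 0
    for _ in range(n):
        a, b = b, a + b
    return a, b


for n in (5, 12):
    a, b = phi_power_coefficients(n)
    print(f"PHI^{n} = {a} + {b}*PHI  (direct: {PHI**n:.6f}, formula: {a + b * PHI:.6f})")
```

The pair (a, b) that comes out is a pair of consecutive Fibonacci terms, F(n-1) and F(n).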


Imaginary Numbers

A neat expression that involves Φ and i is

sin(i ln Φ) = (1/2) i
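
This identity follows from sin(ix) = i sinh(x) together with sinh(ln Φ) = (Φ - 1/Φ)/2 = 1/2. A quick numerical check with Python's standard cmath module, as a sketch:

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2

# Left-hand side: sine of the purely imaginary number i * ln(PHI).
value = cmath.sin(1j * math.log(PHI))

# sinh(ln PHI) is the real number that multiplies i on the right-hand side.
half = math.sinh(math.log(PHI))

print(value)
print(half)
```

Up to rounding, the real part of the first result is zero and its imaginary part equals the second result.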

That's it for this entry. If there is a topic you are interested in feel free to email me and I'll try to get to it.

Thanks to the following sources:

http://en.wikipedia.org/wiki/Golden_ratio
http://goldennumber.net/five(5).htm
http://mathworld.wolfram.com/GoldenRatio.html

http://www.albany.edu/~rp858838


Phi - The Golden Ratio Part I

Posted January 10, 2008 12:00 AM by Roger Pink

Derivation of Phi

Phi, in the words of Euclid:

"A straight line is said to have been cut in extreme and mean ratio (ratio of Phi) when, as the whole line is to the greater segment, so is the greater to the less"

Hopefully a picture will help us visualize what Euclid is saying.

Basically Euclid is saying that if you cut the line segment A above at a point, two line segments are created: B (the longer) and C (the shorter). In the special case where the segments are related such that A/B = B/C, that ratio is Φ (Phi). So let's solve the problem algebraically to find the value of Φ.

A/B = B/C

noting that A=B+C (see diagram above) and substituting we get

(B+C)/B = B/C

which can be rewritten as

B/B + C/B = B/C

1 + C/B = B/C

for simplicity, let's call the ratio B/C = Φ, which gives us

1 + 1/Φ =Φ

solving,

1/Φ = Φ - 1

1 = Φ^2 - Φ

0 = Φ^2 - Φ - 1

which gives roots

Φ = (1 ± √5)/2 (I used the quadratic formula to solve this with a=1, b=-1, c=-1)

The positive root is traditionally taken to be the value of Phi

Φ=1.61803398874989.....

though the negative root is just as valid a solution

Φ= - 0.618033988749894....
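
As a check on the algebra, here is a small sketch that plugs a=1, b=-1, c=-1 into the quadratic formula and confirms that both roots satisfy the starting equation 1 + 1/Φ = Φ. The function name is my own.

```python
import math


def golden_roots():
    """Roots of x^2 - x - 1 = 0 from the quadratic formula (a=1, b=-1, c=-1)."""
    a, b, c = 1, -1, -1
    root = math.sqrt(b * b - 4 * a * c)  # sqrt(5)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


phi_pos, phi_neg = golden_roots()
print(phi_pos, phi_neg)

# Both roots satisfy 1 + 1/x = x, up to floating-point rounding.
for x in (phi_pos, phi_neg):
    print(math.isclose(1 + 1 / x, x))
```

Both roots pass the check, which is the sense in which the negative root is "just as valid" algebraically.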

Here is a link that gives phi to 50,000 places

http://www.cs.arizona.edu/icon/oddsends/phi.htm


The Three Doors Problem

Posted October 23, 2007 12:00 AM by Roger Pink

Imagine you were on a game show where you were presented with 3 doors. Behind one of the doors was a prize and behind the other two were goats (please note, you don't get to keep the goat; it's a symbolic way of saying you've picked the wrong door).

You have no idea which door the prize is behind, but the game show host does. You are asked to pick a door and then the host opens one of the two doors that you didn't pick. The host never opens the door with the prize behind it, only a door with a goat behind it.

So now there are two doors left, the one you selected and the one that has not been opened. You are given the option to either open the door you've already selected or switch your choice and open the other door instead. Which door should you open?

A Formalism for Calculating Probabilities

First let's say the probability that A will occur given some prior information can be denoted P(A|I). Let's denote "the probability that A and B will occur given some prior information" by P(A,B|I).

At this point you may be asking what "Prior Information" means. In this context it means things you know. For instance, we all know that a coin has two sides and that a fair coin has an equal chance of coming up heads or tails. In our notation we would write the probability of heads as:

P(Heads|Information)=P(H|I)=1/2

Similarly the probability of tails is:

P(Tails|Information)=P(T|I)=1/2

And the probability of flipping a heads and a tails (in succession):

P(H,T|I) = P(H|T,I) x P(T|I) = P(H|I) x P(T|I) = 1/2 x 1/2 = 1/4

Note in that last step we used the product rule of probability. Since the probability of getting heads doesn't depend on whether you got tails the previous throw, P(H|T,I) = P(H|I). Also keep in mind that:

P(H|I) + P(~H|I) = P(H|I) + P(T|I) = 1

Where ~H means "not heads". This is known as the sum rule. From the Product Rule and the Sum Rule we can derive a useful equation, consider:

P(X,Y|I) = P(Y,X|I) (this is true since X and Y is the same as Y and X)

and noting that:

P(Y,X|I) = P(Y|X,I) x P(X|I)

we get:

P(X,Y|I)=P(Y|X,I) x P(X|I)

we also know from the product rule:

P(X,Y|I) = P(X|Y,I) x P(Y|I)

if we combine the two equations we get:

P(Y|X,I) x P(X|I) = P(X|Y,I) x P(Y|I)

Solving for P(X|Y,I) we get:

P(X|Y,I) = [P(X|I) x P(Y|X,I)]/P(Y|I)

That final result is known as Bayes' Theorem and is a very useful equation for solving probability problems.
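
To see the theorem at work on numbers you can count by hand, here is a minimal sketch using a fair six-sided die. The die, the two events (X = "the roll is even", Y = "the roll is greater than 3"), and all the names in the code are my own illustration and are not part of the Three Doors Problem.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair six-sided die (illustrative example)


def prob(event, given=lambda face: True):
    """P(event | given) for a fair die, by counting equally likely faces."""
    allowed = [face for face in OUTCOMES if given(face)]
    hits = [face for face in allowed if event(face)]
    return Fraction(len(hits), len(allowed))


def is_even(face):   # event X
    return face % 2 == 0


def is_high(face):   # event Y
    return face > 3


p_x = prob(is_even)                         # P(X|I)
p_y = prob(is_high)                         # P(Y|I)
p_y_given_x = prob(is_high, given=is_even)  # P(Y|X,I): {4, 6} out of {2, 4, 6}
p_x_given_y = prob(is_even, given=is_high)  # P(X|Y,I): {4, 6} out of {4, 5, 6}

# Right-hand side of Bayes' Theorem: P(X|I) x P(Y|X,I) / P(Y|I)
bayes_rhs = p_x * p_y_given_x / p_y

print(p_x_given_y, bayes_rhs)
```

Counting directly and going through Bayes' Theorem give the same fraction, as they should.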

Back to the Three Doors Problem

So let's use Bayes' Theorem to calculate something simple first, just so you get a feel for it. We'll calculate the probability that we flip a coin and get heads, given that we flipped the coin a moment before and got tails, plus our prior information that it's a fair coin and we know how those are supposed to work. Using Bayes' Theorem we get:

P(H|T,I)= [P(H|I) x P(T|H,I)]/P(T|I) = [(1/2) x (1/2)] / (1/2) = 1/2

which makes sense since we should expect P(H|T,I) = P(H|I) = 1/2 since a fair coin's result doesn't depend on its previous result.

So let's apply it to the three doors problem. First define the probabilities of selecting door 1, 2, or 3 (the first part of the game where you select a door).

P(Select Door 1|Information)= P(SD1|I)=1/3
P(SD1|I) = P(SD2|I) = P(SD3|I) = 1/3 (basically, you have equal odds for each door)

Now let's figure out the odds of the host opening a door given your selection and the location of the prize. For convenience, and since the problem is symmetric, let's pick door number 1 as our selected door.

Scenario 1, the door with the prize is door 1

P(Opened Door is 1|Selected Door is 1, Prize Door is 1, I) = P(OD1|SD1,PD1,I) = 0 (remember, the host won't open the door you picked or the door with the prize)

P(OD2|SD1,PD1,I) = 1/2 (the host can choose either of the doors you didn't pick)

P(OD3|SD1,PD1,I) = 1/2 (the host can choose either of the doors you didn't pick)

Scenario 2, the door with the prize is door 2

P(OD1|SD1,PD2, I) = 0 (remember, the host won't open the door you picked)

P(OD2|SD1,PD2,I) = 0 (the host won't open the door with the prize behind it)

P(OD3|SD1,PD2,I) = 1 (this is the only door in this scenario the host can open)

Scenario 3, the door with the prize is door 3

P(OD1|SD1,PD3, I) = 0 (remember, the host won't open the door you picked)

P(OD2|SD1,PD3,I) = 1 (this is the only door in this scenario the host can open)

P(OD3|SD1,PD3,I) = 0 (the host won't open the door with the prize behind it)

So let's use Bayes' Theorem to calculate the probability that the prize is behind door number 1 given that you selected door number 1 and the host opened door number 2.

P(PD1|SD1,OD2,I)= [P(PD1|I) x P(SD1,OD2|PD1,I)] / P(SD1,OD2|I)
= [P(PD1|I) x P(OD2,SD1|PD1,I)] / P(OD2,SD1|I)
= [P(PD1|I) x [P(OD2|SD1,PD1,I) x P(SD1|PD1,I)]] / [P(SD1|I) x P(OD2|SD1,I)]
= [ (1/3) x (1/2) x (1/3) ] / [ (1/3) x (1/2) ] = 1/3

Here P(SD1|PD1,I) = P(SD1|I) = 1/3, since your selection doesn't depend on where the prize is, and P(OD2|SD1,I) = (1/3)(1/2) + (1/3)(0) + (1/3)(1) = 1/2, which comes from adding up the three scenarios above.

Now let's calculate the probability that the prize is behind door number 3 given that you selected door number 1 and the host opened door number 2:

P(PD3|SD1,OD2,I)= [P(PD3|I) x P(SD1,OD2|PD3,I)] / P(SD1,OD2|I)
= [P(PD3|I) x P(OD2,SD1|PD3,I)] / P(OD2,SD1|I)
= [P(PD3|I) x [P(OD2|SD1,PD3,I) x P(SD1|PD3,I)]] / [P(SD1|I) x P(OD2|SD1,I)]
= [ (1/3) x (1) x (1/3) ] / [ (1/3) x (1/2) ] = 2/3

So there you have it: if you selected Door 1 and the host opened Door 2, then you have a 2/3 (about 67%) chance of winning if you switch your choice of doors and a 1/3 (about 33%) chance if you stick with your original choice of door.
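
The same calculation can be done with exact fractions in a few lines. This is a sketch rather than the only way to set it up: it uses the scenario probabilities listed above for the host, and it relies on the fact that your selection doesn't depend on where the prize is, so the P(SD1|...) factors cancel. The function names are hypothetical.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def p_open(opened, selected, prize):
    """P(host opens `opened` | you selected `selected`, prize is behind `prize`)."""
    if opened == selected or opened == prize:
        return Fraction(0)  # the host never opens your door or the prize door
    allowed = [d for d in DOORS if d != selected and d != prize]
    return Fraction(1, len(allowed))


def posterior(prize, selected, opened):
    """P(prize door | selected door, opened door) via Bayes' Theorem."""
    prior = Fraction(1, len(DOORS))  # prize equally likely behind each door
    # P(opened | selected): add up over every possible prize location.
    evidence = sum(prior * p_open(opened, selected, p) for p in DOORS)
    return prior * p_open(opened, selected, prize) / evidence


print("stay with door 1:  ", posterior(1, selected=1, opened=2))
print("switch to door 3:  ", posterior(3, selected=1, opened=2))
```

Because the function takes any selected and opened door, the other cases can be checked the same way; by symmetry they give the same 1/3 and 2/3 split.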


I believe it, it makes sense

If that makes sense to you, great, you see how useful Bayes' Theorem is for calculating probabilities.

I'm having a tough time believing it, it doesn't make sense

If you have your doubts, that's understandable. You may believe that once the host has eliminated a door, the odds have become 1/2 and 1/2, not 1/3 and 2/3. That's a natural guess, but it is wrong; the correct answer is 1/3 and 2/3.

Think about it this way: instead of thinking of it as the odds of each door having the prize, think of it as the odds of your door having or not having the prize. If there are three doors, the odds are:

Selecting Prize = 1/3
Not Selecting Prize = 2/3

Those odds haven't changed just because the host has eliminated one of the doors, because you made your choice when there were three doors. Thus you're better off switching: you're sitting with the door with the 1/3 odds while the other has 2/3 odds.

What if, instead of 3 doors, there were 1000 doors? You select a door at random (1/1000) and then the host opens 998 doors leaving only yours and another door of his choosing not opened. Which door do you want, the one you picked at random, or the one the host left unopened? I'd take the one the host left unopened.
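
If you would rather see it than calculate it, the game is easy to simulate. Below is a rough sketch for any number of doors, assuming the host always opens every door except yours and one other, and never reveals the prize. The function names, the number of trials, and the fixed random seed are arbitrary choices of mine.

```python
import random


def play(num_doors, switch, rng):
    """Play one game; return True if the player ends up with the prize."""
    prize = rng.randrange(num_doors)
    choice = rng.randrange(num_doors)
    # The host leaves exactly one other door closed: the prize door if the
    # player missed it, otherwise a randomly chosen goat door.
    if choice == prize:
        other = rng.choice([d for d in range(num_doors) if d != choice])
    else:
        other = prize
    final = other if switch else choice
    return final == prize


def win_rate(num_doors, switch, trials=20000, seed=0):
    """Fraction of simulated games won with the given strategy."""
    rng = random.Random(seed)
    wins = sum(play(num_doors, switch, rng) for _ in range(trials))
    return wins / trials


for doors in (3, 1000):
    print(f"{doors} doors: stay {win_rate(doors, False):.3f}, "
          f"switch {win_rate(doors, True):.3f}")
```

With 3 doors the switching strategy should win close to 2/3 of the time, and with 1000 doors it should win almost every time.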

Why Use Bayes' Theorem?

The strength of Bayes' Theorem is you just identify your probabilities and calculate. You don't have to "figure it out". You can calculate and then reconcile yourself with the answer afterwards. It's a very useful tool, especially when the probabilities get complicated and our intuition fails us.

Ok, that's all for now. Special thanks to the following websites:

http://mathforum.org/dr.math/faq/faq.monty.hall.html
http://en.wikipedia.org/wiki/Monty_Hall_problem
http://en.wikipedia.org/wiki/Bayes'_theorem

Until next time.
