GetComponent vs .transform vs _transform

After reading this post on Reddit, I decided to run a few tests to see the performance difference between these three options. This is what the post says:
 

I was a bit surprised when I was downvoted when I pointed out that .transform has no real cost and generate no garbage now.
I cannot spot which version of Unity implemented the fix, I only know that 3.5 doesn’t have it, and 4.3+ has it.
Example:
    private void Update()
    {
        for (int i = 0; i < 10000; i++)
        {
            Vector3 v = transform.position;
        }
    }
This snippet generates no garbage, and 10,000 access to .transform takes 0.04 ms on my shitty work computer. (Tested on 4.5.4)
So, caching transform by yourself is now rather pointless.
— http://www.reddit.com/r/Unity3D/comments/2l9tso/transform_is_internally_cached/

So I decided to run a few tests of my own. I'm using the same Profile method that I used for my previous tests:
 

    // At the top of the file:
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using UnityEngine;
    // System.Diagnostics and UnityEngine both define Debug; use Unity's.
    using Debug = UnityEngine.Debug;

    public static void Profile (string pDescription, int pInteractions, bool pRemoveBestAndWorst, Action pAction)
    {
        // Run two extra iterations if the best and the worst ones will be discarded
        if (pRemoveBestAndWorst)
            pInteractions += 2;

        // Clean up memory so earlier garbage doesn't affect the timings
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // Warm up the method
        pAction();

        List<TimeSpan> tAllTimes = new List<TimeSpan>();
        for (int i = 0; i < pInteractions; i++)
        {
            Stopwatch tStopwatch = Stopwatch.StartNew();

            pAction();

            tStopwatch.Stop();
            tAllTimes.Add(tStopwatch.Elapsed);
        }
        // GetAverageTime is the author's helper; its code is not included here
        TimeSpan tAverageTime = GetAverageTime(tAllTimes, pRemoveBestAndWorst);

        // Restore the requested count for the log message
        if (pRemoveBestAndWorst)
            pInteractions -= 2;

        Debug.Log("Average time of " + pDescription + " in " + pInteractions + " times is: " + tAverageTime);
    }
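
The Profile method relies on a GetAverageTime helper that isn't shown above. As a rough sketch of what such a helper could look like (this is my assumption, not the original code; the class name ProfileMath and the parameter names are made up for illustration), it could sort the samples, drop the fastest and the slowest when asked to, and average the rest:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the GetAverageTime helper used by Profile.
public static class ProfileMath
{
    public static TimeSpan GetAverageTime(List<TimeSpan> pAllTimes, bool pRemoveBestAndWorst)
    {
        if (pAllTimes == null || pAllTimes.Count == 0)
            return TimeSpan.Zero;

        // Sort a copy so the caller's list is left untouched
        List<TimeSpan> tSorted = pAllTimes.OrderBy(t => t).ToList();

        // Drop the fastest and the slowest sample, but only if something remains afterwards
        if (pRemoveBestAndWorst && tSorted.Count > 2)
        {
            tSorted.RemoveAt(tSorted.Count - 1);
            tSorted.RemoveAt(0);
        }

        // Average in ticks to avoid floating-point rounding
        long tTotalTicks = 0;
        foreach (TimeSpan tTime in tSorted)
            tTotalTicks += tTime.Ticks;

        return TimeSpan.FromTicks(tTotalTicks / tSorted.Count);
    }
}
```

Dropping the two extremes is the reason Profile adds two extra iterations up front: the average is still computed over the number of runs that was asked for.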

And this is the code I used for the tests:

    // Using GetComponent
    for (int i = 0; i < 1000; i++)
    {
        this.GetComponent<Transform>().position *= 0.01f;
    }

    // Using Unity's cached transform
    for (int i = 0; i < 1000; i++)
    {
        this.transform.position *= 0.01f;
    }

    // Using my own cached transform (_transform is a field holding a previously fetched Transform)
    for (int i = 0; i < 1000; i++)
    {
        _transform.position *= 0.01f;
    }

And here are the results

Looking at the .transform property in the ILSpy disassembly, you can see that its getter makes an internal call into the Unity runtime (InternalGetTransform) every time it is accessed. That's why .transform is a bit slower than your own cached transform.

    // UnityEngine.Component
    public Transform transform
    {
        get
        {
            return this.InternalGetTransform();
        }
    }

    [WrapperlessIcall]
    [MethodImpl(MethodImplOptions.InternalCall)]
    internal extern Transform InternalGetTransform();

I know this is a little overkill and probably doesn't make much difference in most cases, but it's good to know ;)