checkpoint, don't expect this to read right yet. :)

llvm-svn: 115426
2010-10-02 21:59:30 +00:00 · 2010-10-02 21:59:30 +00:00 · bf1cf670a6
parent 4e3918c06b
commit bf1cf670a6
1 changed files with 115 additions and 96 deletions
--- a/llvm/docs/ReleaseNotes.html
+++ b/llvm/docs/ReleaseNotes.html
@@ -67,7 +67,6 @@ current one.  To see the release notes for a specific release, please see the
 Almost dead code.
  include/llvm/Analysis/LiveValues.h => Dan
  lib/Transforms/IPO/MergeFunctions.cpp => consider for 2.8.
-  llvm/Analysis/PointerTracking.h => Edwin wants this, consider for 2.8.
  GEPSplitterPass
 -->
 
@@ -82,79 +81,6 @@ Almost dead code.
 
 <!-- Announcement, lldb, libc++ -->
 
- <!-- to write:
-  MachineCSE tuned and on by default.
-  llvm.dbg.value: variable debug info for optimized code
-  MC Assembler backend is now real, does relaxation and is bitwise identical
-    with darwin assembler in huge majority of all cases.
-  new GHC calling convention
-  New half float intrinsics LangRef.html#int_fp16
-  Rewrote tblgen's type inference for backends to be more consistent and
-     diagnose more target bugs.  This also allows limited support for writing
-     patterns for instructions that return multiple results, e.g. a virtual
-     register and a flag result.  Stuff that used 'parallel' before should use
-     this.
-  New ARM/Thumb disassembler support in MC.
-  New SSEDomainFix pass: 
-    On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a
-    register in a different domain than where it was defined. Some instructions
-    have equvivalents for different domains, like por/orps/orpd.  The
-    SSEDomainFix pass tries to minimize the number of domain crossings by
-    changing between equvivalent opcodes where possible.
-  Support for the Intel AES instructions in the assembler.
-  memcpy, memmove, and memset now take address space qualified pointers + volatile.
-  per-instruction debug info metadata is much faster and uses less space (new DebugLoc class).
-  -ffunction-sections and -fdata-sections are supported on ELF targets.
-  Now iterate function passes when a cgsccpassmanager detects a devirtualization
-  -momit-leaf-frame-pointer now supported.
-  New -regalloc=fast,  =local got removed
-  New -regalloc=default option that chooses a register allocator based on the -O optimization level.
-  New "trap values" concept: http://llvm.org/docs/LangRef.html#trapvalues
-  Improved trip count analysis for <= and >= loops, and uses sign overflow info.
-  REMOVED: SCCVN pass.
-  X86 backend attempts to promote 16-bit integer operations to 32-bits to avoid
-     0x66 prefixes, which are slow on some microarchitectures and bloat the code
-     on others.
-  X87 fp stackifier is global!
-  LTO debug info support?
-  NEON: Better performance for QQQQ (4-consecutive Q register) instructions.  New reg sequence abstraction?
-  New support for X86 "thiscall" calling convention (x86_thiscallcc in IR).
-  ARM: Better scheduling (list-hybrid, hybrid?)
-  New SubRegIndex tblgen class for targets -> jakob
-  ARM: Tail call support.
-  AVX support in the MC assembler.  Full compiler support not done yet.
-  Atomics now get legalized when not natively supported (jim g)
-  ARM: General performance work and tuning.
-  Bottom up fast isel.  Simple Load reuse.  No more machinedce.  Load folding at -O0?
-  New linker_private_weak and linker_private_weak_def_auto linkage types
-  compiler_rt softfloat support.
-  X86 ABI:  <2 x float> in IR no longer maps onto MMX, it turns into <4 x float>
-  IR ABI: <3 x float> is passed as <4 x float> instead of 3 floats.
-  renamed "Release" -> "Release+Asserts"; "Release-Asserts" -> "Release etc.
-  New COPY instruction. copyRegToReg -> copyPhysReg, isMoveInstr is gone.
-  JumpThreading much more aggressive about implied value relations.
-  New RegionInfo pass  "opt -regions analyze" or "opt -view-regions".
-  mc assembler supports macros.
-  RenderMachineFunction: -rendermf
-  SplitKit?
-  Evan: Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
-  Evan: Add an ILP scheduler.  On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%.
-  RegisterPass<> -> INTIALIZE_PASS()
-  llvm-diff?
-  Preliminary work on TBAA but not usable in 2.8.
-  Atomic lowering patch: -loweratomic (see Passes.html#loweratomic)
-  compiler_rt now includes extensive a fairly testsuite for blocks language feature and the blocks runtime.
-  New OptimizeExts+OptimizeCmps -> PeepholeOptimizer pass
-  Triples are now stored in normalized form.  Triple::normalize.
-  New LocalStackSlotAllocation.cpp pass (jimg)
-  New llvm.x86.int intrinsic (for int $42 and int3)
-  New CorrelatedValuePropagation pass, not on by default in 2.8 yet.
-  Verbose assembly decodes X86 shuffle instructions, e.g.:
-  	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
-	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
-	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
- -->
- 

 <!-- *********************************************************************** -->
 <div class="doc_section">
@@ -253,10 +179,10 @@ libgcc routines).</p>

 <p>
 All of the code in the compiler-rt project is available under the standard LLVM
-License, a "BSD-style" license.  New in LLVM 2.8: 
-
-Soft float support
-</p>
+License, a "BSD-style" license.  New in LLVM 2.8, compiler_rt now supports 
+soft floating point (for targets that don't have a real floating point unit),
+and includes an extensive testsuite for the "blocks" language feature and the
+blocks runtime included in compiler_rt.</p>

 </div>

@@ -526,10 +452,6 @@ organization changes have happened:
 <p>LLVM 2.8 includes several major new capabilities:</p>

 <ul>
-<li>atomic lowering pass.</li>
-<li>RegionInfo pass: opt -regions analyze" or "opt -view-regions". 
-<!-- Tobias Grosser --></li>
-<li>ARMGlobalMerge: <!-- Anton --> </li>
 <li>llvm-diff</li>
 </ul>

@@ -546,6 +468,13 @@ expose new optimization opportunities:</p>

 <ul>

+  memcpy, memmove, and memset now take address space qualified pointers + volatile.
+  per-instruction debug info metadata is much faster and uses less space (new DebugLoc class).
+  New "trap values" concept: http://llvm.org/docs/LangRef.html#trapvalues
+  New linker_private_weak and linker_private_weak_def_auto linkage types
+  Triples are now stored in normalized form.  Triple::normalize.
+
+
 <li>LLVM 2.8 changes the internal order of operands in <a
  href="http://llvm.org/doxygen/classllvm_1_1InvokeInst.html"><tt>InvokeInst</tt></a>
  and <a href="http://llvm.org/doxygen/classllvm_1_1CallInst.html"><tt>CallInst</tt></a>.
@@ -612,6 +541,14 @@ release includes a few major enhancements and additions to the optimizers:</p>
 <ul>

 <li></li>
+  Preliminary work on TBAA but not usable in 2.8.
+  New CorrelatedValuePropagation pass, not on by default in 2.8 yet.
+  JumpThreading much more aggressive about implied value relations.
+  New RegionInfo pass  "opt -regions analyze" or "opt -view-regions".
+  Improved trip count analysis for <= and >= loops, and uses sign overflow info.
+  llvm.dbg.value: variable debug info for optimized code
+  Now iterate function passes when a cgsccpassmanager detects a devirtualization
+  Atomic lowering patch: -loweratomic (see Passes.html#loweratomic)

 </ul>

@@ -639,22 +576,38 @@ release includes a few major enhancements and additions to the optimizers:</p>

 <div class="doc_text">
 <p>
-FIXME: Rewrite.
-
-The LLVM Machine Code (aka MC) sub-project of LLVM was created to solve a number
+The LLVM Machine Code (aka MC) subsystem was created to solve a number
 of problems in the realm of assembly, disassembly, object file format handling,
 and a number of other related areas that CPU instruction-set level tools work
-in. It is a sub-project of LLVM which provides it with a number of advantages
-over other compilers that do not have tightly integrated assembly-level tools.
-For a gentle introduction, please see the <a
+in.</p>
+
+<p>The MC subproject has made great leaps in LLVM 2.8.  For example, support for
+   directly writing .o files from LLC (and clang) now works reliably for
+   darwin/x86[-64] (including inline assembly support) and the integrated
+   assembler is turned on by default in Clang for these targets.  This provides
+   improved compile times among other things.</p>
+
+<ul>
+<li>The entire compiler has converted over to using the MCStreamer assembler API
+    instead of writing out a .s file textually.</li>
+<li>The "assembler parser" is far more mature than in 2.7, supporting a full
+    complement of directives, assembler macros, etc.</li>
+<li>The "assembler backend" has been completed, including support for relaxation,
+    relocation processing, and all the other things that an assembler does.</li>
+<li>The MachO file format support is now fully functional.</li>
+<li>The MC disassembler now fully supports ARM and Thumb.  ARM assembler support
+    is still in early development though.</li>
+<li>The X86 MC assembler now supports the X86 AES and AVX instruction sets.</li>
+<li>Work on ELF and COFF support is well underway, but isn't useful yet in LLVM
+    2.8.  Please contact the llvmdev mailing list if you're interested in
+    this.</li>
+</ul>
+
+<p>For more information, please see the <a
 href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro to the
 LLVM MC Project Blog Post</a>.
 </p>

-<p>2.8 status here.  Basic correctness, some obscure missing instructions on
-   mainline, on by default in clang.
-   Entire compiler backend converted to use mcstreamer.
-   </p>
 </div>	


@@ -671,7 +624,36 @@ infrastructure, which allows us to implement more aggressive algorithms and make
 it run faster:</p>

 <ul>
-<li>MachO writer works.</li>
+<li></li>
+
+  MachineCSE tuned and on by default.
+
+  Rewrote tblgen's type inference for backends to be more consistent and
+     diagnose more target bugs.  This also allows limited support for writing
+     patterns for instructions that return multiple results, e.g. a virtual
+     register and a flag result.  Stuff that used 'parallel' before should use
+     this.
+
+  New -regalloc=fast,  =local got removed
+  New -regalloc=default option that chooses a register allocator based on the -O optimization level.
+  New SubRegIndex tblgen class for targets -> jakob
+
+  Bottom up fast isel.  Simple Load reuse.  No more machinedce.
+  IR ABI: <3 x float> is passed as <4 x float> instead of 3 floats.
+
+  New COPY instruction. copyRegToReg -> copyPhysReg, isMoveInstr is gone.
+  RenderMachineFunction: -rendermf
+  SplitKit?
+  Evan: Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
+  Evan: Add an ILP scheduler.  On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%.
+
+  New OptimizeExts+OptimizeCmps -> PeepholeOptimizer pass
+  New LocalStackSlotAllocation.cpp pass (jimg)
+  Atomics now get legalized when not natively supported (jim g)
+
+  -ffunction-sections and -fdata-sections are supported on ELF targets.
+  -momit-leaf-frame-pointer now supported.
+
 </ul>
 </div>

@@ -689,6 +671,30 @@ it run faster:</p>
    in registers across basic blocks, dramatically improving performance of code
    that uses long double, and when targetting CPUs that don't support SSE.</li>

+  New SSEDomainFix pass: 
+    On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a
+    register in a different domain than where it was defined. Some instructions
+    have equivalents for different domains, like por/orps/orpd.  The
+    SSEDomainFix pass tries to minimize the number of domain crossings by
+    changing between equivalent opcodes where possible.
+
+  X86 backend attempts to promote 16-bit integer operations to 32-bits to avoid
+     0x66 prefixes, which are slow on some microarchitectures and bloat the code
+     on others.
+
+  New support for X86 "thiscall" calling convention (x86_thiscallcc in IR) for windows.
+
+  New llvm.x86.int intrinsic (for int $42 and int3)
+
+  Verbose assembly decodes X86 shuffle instructions, e.g.:
+  	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
+	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
+	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
+        
+  X86 ABI:  <2 x float> in IR no longer maps onto MMX, it turns into <4 x float>
+
+  new GHC calling convention
+
 </ul>

 </div>
@@ -704,6 +710,14 @@ it run faster:</p>

 <ul>

+  NEON: Better performance for QQQQ (4-consecutive Q register) instructions.  New reg sequence abstraction?
+  ARM: Better scheduling (list-hybrid, hybrid?)
+  ARM: Tail call support.
+  ARM: General performance work and tuning.
+
+  ARM: Half float support through intrinsics LangRef.html#int_fp16
+<li>ARMGlobalMerge: <!-- Anton --> </li>
+
 <li>
  All of the NEON load and store intrinsics (llvm.arm.neon.vld* and
  llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes
@@ -795,17 +809,22 @@ it run faster:</p>
 on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading
 from the previous release.</p>

+
+  renamed "Release" -> "Release+Asserts"; "Release-Asserts" -> "Release", etc.
+  RegisterPass<> -> INITIALIZE_PASS()
+
+
 <ul>
 <li>.ll file doesn't produce #uses comments anymore, to get them, run a .bc file
   through "llvm-dis --show-annotations".</li>
 <li>MSIL Backend removed.</li>
 <li>ABCD and SSI passes removed.</li>
 <li>'Union' LLVM IR feature removed.</li>
+<li>SCCVN pass removed.</li>
 </ul>

 <p>In addition, many APIs have changed in this release.  Some of the major LLVM
 API changes are:</p>
-
 <ul>
 </ul>

@@ -844,8 +863,8 @@ href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list</a>.</p>
 <ul>
 <li>The Alpha, SPU, MIPS, PIC16, Blackfin, MSP430, SystemZ and MicroBlaze
    backends are experimental.</li>
-<li><tt>llc</tt> "<tt>-filetype=asm</tt>" (the default) is the only
-    supported value for this option.  XXX Update me</li>
+<li><tt>llc</tt> "<tt>-filetype=obj</tt>" is experimental on all targets
+    other than darwin-i386 and darwin-x86_64.</li>
 </ul>

 </div>
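
The SSEDomainFix note moved into the X86 section by this commit describes swapping between equivalent opcodes (por/orps/orpd) to avoid domain crossings. The idea can be sketched as a toy model in Python. This is only an illustration under loud assumptions: the real pass operates on machine instructions inside LLVM, while the sketch below treats a dependency chain as a plain list of mnemonic strings and uses a simple forward greedy rule. The helper names (`domain_of`, `family_of`, `count_crossings`, `fix_domains`) and the small opcode tables are hypothetical and are not part of LLVM.

```python
# Toy model of the SSEDomainFix idea; NOT the LLVM implementation.
# Assumption: a "chain" is a list of mnemonics where each instruction
# consumes the result of the previous one.

# Opcodes that have an equivalent form in each execution domain.
EQUIVALENTS = {
    "or": {"int": "por", "ps": "orps", "pd": "orpd"},
    "and": {"int": "pand", "ps": "andps", "pd": "andpd"},
    "xor": {"int": "pxor", "ps": "xorps", "pd": "xorpd"},
}

# Opcodes that exist in one domain only (a small illustrative subset).
FIXED = {"paddd": "int", "addps": "ps", "mulps": "ps", "addpd": "pd"}


def family_of(opcode):
    """Return the domain->opcode table for a swappable opcode, else None."""
    for variants in EQUIVALENTS.values():
        if opcode in variants.values():
            return variants
    return None


def domain_of(opcode):
    """Return the execution domain ("int", "ps" or "pd") of an opcode."""
    if opcode in FIXED:
        return FIXED[opcode]
    variants = family_of(opcode)
    if variants is None:
        raise ValueError(f"unknown opcode: {opcode}")
    for domain, name in variants.items():
        if name == opcode:
            return domain


def count_crossings(chain):
    """Count adjacent pairs in the chain that execute in different domains."""
    domains = [domain_of(op) for op in chain]
    return sum(1 for a, b in zip(domains, domains[1:]) if a != b)


def fix_domains(chain):
    """Greedy forward pass: rewrite each swappable opcode into the domain
    of the instruction that precedes it."""
    result = []
    current = None
    for op in chain:
        variants = family_of(op)
        if variants is not None and current is not None:
            op = variants[current]
        result.append(op)
        current = domain_of(op)
    return result


if __name__ == "__main__":
    chain = ["addps", "por", "mulps"]
    fixed = fix_domains(chain)
    print(chain, count_crossings(chain))
    print(fixed, count_crossings(fixed))
```

Because the rule only looks backward, a swappable opcode at the start of a chain keeps its original domain; a fuller treatment would also consider the instructions that follow it.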