Commit Graph

173 Commits

Author SHA1 Message Date
Evan Cheng 7fae11b231 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Evan Cheng ecb2908bf9 Sink codegen optimization level into MCCodeGenInfo along side relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Devang Patel 76c8563239 svn mv Target/ARM/ARMGlobalMerge.cpp Transforms/Scalar/GlobalMerge.cpp
There is no reason to have simple IR level pass in lib/Target.

llvm-svn: 142200
2011-10-17 17:17:43 +00:00
Lang Hames de7ab801cc Add a natural stack alignment field to TargetData, and prevent InstCombine from
promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.

The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.

llvm-svn: 141599
2011-10-10 23:42:08 +00:00
Jakob Stoklund Olesen f7ad189033 Use ExecutionDepsFix instead of NEONMoveFix.
This enables NEON domain tracking across basic blocks, but should
otherwise do the same thing.

llvm-svn: 140772
2011-09-29 02:48:41 +00:00
Bill Wendling a0d5f268a9 Move to ISelLowering.
llvm-svn: 140754
2011-09-29 01:13:55 +00:00
Bill Wendling dfe5acd34e Ahem...actually *add* the ARMSjLjLowering pass to the pass manager.
llvm-svn: 140718
2011-09-28 20:29:28 +00:00
Bill Wendling 354ff9e348 This is the start of the new SjLj EH preparation pass, which will replace the
current IR-level pass.

The old SjLj EH pass has some problems, especially with the new EH model. Most
significantly, it violates some of the new restrictions the new model has. For
instance, the 'dispatch' table wants to jump to the landing pad, but we cannot
allow that because only an invoke's unwind edge can jump to a landing pad. This
requires us to mangle the code something awful. In addition, we need to keep the
now dead landingpad instructions around instead of CSE'ing them because the
DWARF emitter uses that information (they are dead because no control flow edge
will execute them - the control flow edge from an invoke's unwind is superceded
by the edge coming from the dispatch).

Basically, this pass belongs not at the IR level where SSA is king, but at the
code-gen level, where we have more flexibility.

llvm-svn: 140646
2011-09-27 22:14:12 +00:00
Evan Cheng 9dad430486 Hide -global-merge option.
llvm-svn: 138540
2011-08-25 01:22:49 +00:00
Evan Cheng f066b2fe99 Add a command line option to disable global merge pass.
llvm-svn: 138536
2011-08-25 01:00:36 +00:00
Evan Cheng 3ca20e64ac Remove a out-of-place comment.
llvm-svn: 138534
2011-08-25 00:54:42 +00:00
Evan Cheng 2bb4035707 Move TargetRegistry and TargetSelect from Target to Support where they belong.
These are strictly utilities for registering targets and components.

llvm-svn: 138450
2011-08-24 18:08:43 +00:00
Evan Cheng ad5f485957 Sink ARM mc routines into MCTargetDesc.
llvm-svn: 135825
2011-07-23 00:00:19 +00:00
Evan Cheng efd9b4240f - Move CodeModel from a TargetMachine global option to MCCodeGenInfo.
- Introduce JITDefault code model. This tells targets to set different default
  code model for JIT. This eliminates the ugly hack in TargetMachine where
  code model is changed after construction.

llvm-svn: 135580
2011-07-20 07:51:56 +00:00
Evan Cheng 2129f59637 Introduce MCCodeGenInfo, which keeps information that can affect codegen
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.

llvm-svn: 135468
2011-07-19 06:37:02 +00:00
Evan Cheng 1705ab00ab Rename createAsmInfo to createMCAsmInfo and move registration code to MCTargetDesc to prepare for next round of changes.
llvm-svn: 135219
2011-07-14 23:50:31 +00:00
Evan Cheng 4d1ca96bfc Eliminate asm parser's dependency on TargetMachine:
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
  to generate asm matcher subtarget feature queries. e.g.
  "ModeThumb,FeatureThumb2" is translated to
  "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".

llvm-svn: 134678
2011-07-08 01:53:10 +00:00
Evan Cheng 2bd65363a8 Factor ARM triple parsing out of ARMSubtarget. Another step towards making ARM subtarget info available to MC.
llvm-svn: 134569
2011-07-07 00:08:19 +00:00
Evan Cheng fe6e405e8c Fix the ridiculous SubtargetFeatures API where it implicitly expects CPU name to
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.

The fix is to just have the clients explictly pass the CPU name!

llvm-svn: 134127
2011-06-30 01:53:36 +00:00
Evan Cheng 1b049f5851 Remove TargetOptions.h dependency from ARMSubtarget.
llvm-svn: 133738
2011-06-23 18:15:17 +00:00
Daniel Dunbar 2b9b0e3748 ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows()
predicates.

llvm-svn: 129816
2011-04-19 21:14:45 +00:00
Bob Wilson 0858c3aaed This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

llvm-svn: 129775
2011-04-19 18:11:57 +00:00
Jim Grosbach 6ade7e0bac Tidy up.
llvm-svn: 129034
2011-04-06 22:35:47 +00:00
NAKAMURA Takumi 4c14a5cc2c Triple::MinGW64 is deprecated and removed. We can use Triple::MinGW32 generally.
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.

llvm-svn: 125747
2011-02-17 12:24:17 +00:00
Rafael Espindola b3eca9bb71 Add support for the --noexecstack option.
llvm-svn: 124077
2011-01-23 17:55:27 +00:00
Anton Korobeynikov 2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Evan Cheng 62c7b5bf76 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Chris Lattner 9fdd10dbce tidy up
llvm-svn: 119462
2010-11-17 05:41:32 +00:00
Anton Korobeynikov f7183edb59 First step of huge frame-related refactoring: move emit{Prologue,Epilogue} out of TargetRegisterInfo to TargetFrameInfo, which is definitely much better suitable place
llvm-svn: 119097
2010-11-15 00:06:54 +00:00
Eric Christopher 7ae11c6962 Revert the accidental commit I made reverting the previous commit.
llvm-svn: 118835
2010-11-11 20:50:14 +00:00
Eric Christopher b90f7004cf Revert this temporarily.
llvm-svn: 118827
2010-11-11 19:47:02 +00:00
Rafael Espindola 66e08d43d2 Jim Asked us to move DataLayout on ARM back to the most specialized classes. Do
so and also change X86 for consistency.

Investigating if this can be improved a bit.

llvm-svn: 115469
2010-10-03 18:59:45 +00:00
Jason W Kim b32124545b I added a new file ARMAsmBackend which stubs out in similar ways to
the eqv X86 class.
For now, I split the ELFARMAsmBackend from the DarwinARMAsmBackend
(also mimicking X86)

Tested against -r115126

llvm-svn: 115129
2010-09-30 02:17:26 +00:00
Nick Lewycky 7d483d352b Resolve this GCC warning:
ARMTargetMachine.cpp:53: error: control reaches end of non-void function

llvm-svn: 114992
2010-09-28 21:40:26 +00:00
Rafael Espindola 69aa15155f Odd additional stub framework for the ARM MC ELF emission.
llc now recognizes the "intent" to support MC/obj emission for ARM, but
given that they are all stubs, it asserts on --filetype=obj --march=arm

Patch by Jason Kim.

llvm-svn: 114856
2010-09-27 18:31:37 +00:00
Bob Wilson c597fd3b4a Convert some VTBL and VTBX instructions to use pseudo instructions prior to
register allocation.  Remove the NEONPreAllocPass, which is no longer needed.
Yeah!!

llvm-svn: 113818
2010-09-13 23:55:10 +00:00
Evan Cheng 5190f09291 Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
Evan Cheng ce8fb68078 Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations.
llvm-svn: 110584
2010-08-09 18:35:19 +00:00
Evan Cheng 7d8d9a5dd5 Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation.
llvm-svn: 110579
2010-08-09 17:16:10 +00:00
Anton Korobeynikov 19edda0323 Hook in GlobalMerge pass
llvm-svn: 109359
2010-07-24 21:52:08 +00:00
Evan Cheng c3525dc0fd Remove early IT block formation. It's not used.
llvm-svn: 107513
2010-07-02 21:07:09 +00:00
Bob Wilson 07aead2f8d Add missing ARM and Thumb data layout info for vector types.
Radar 8128745.

llvm-svn: 106820
2010-06-25 04:41:08 +00:00
Evan Cheng c26e2f4b70 Oops. IT block formation pass needs to be run at any optimization level.
llvm-svn: 106775
2010-06-24 19:10:14 +00:00
Evan Cheng 119824ed4d Move ARM if-conversion before post-ra scheduling.
llvm-svn: 106355
2010-06-18 23:32:07 +00:00
Evan Cheng 2d51c7c592 Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.

llvm-svn: 106344
2010-06-18 23:09:54 +00:00
Evan Cheng f128bdcb55 Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler.
llvm-svn: 106091
2010-06-16 07:35:02 +00:00
Evan Cheng 83c64ee8de Typo.
llvm-svn: 105677
2010-06-09 03:49:12 +00:00
Evan Cheng 47cd593023 Thumb2 IT blocks are fairly expensive. When there are multiple selects using
the same condition, it's important to make sure they are scheduled together
to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms
IT blocks early (by re-scheduling instructions and split basic blocks) to
attempt to fix this. This is not turned on by default since I am not sure this
is the right fix.

Another issue is llvm selects are modeled as two-address conditional moves.
This can be very bad when the copies before the conditional moves are not
coalesced away. Teach IT formation pass to move the copies above the IT block
(when legal) to avoid breaking the IT block.

llvm-svn: 105669
2010-06-09 01:46:50 +00:00
Dan Gohman bb919dfb6b Implement a bunch more TargetSelectionDAGInfo infrastructure.
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.

llvm-svn: 103481
2010-05-11 17:31:57 +00:00