  1. Jun 12, 2017
  2. Jun 01, 2017
  3. May 01, 2017
  4. Apr 26, 2017
  5. Apr 19, 2017
  6. Mar 30, 2017
  7. Mar 23, 2017
  8. Mar 21, 2017
  9. Feb 23, 2017
  10. Feb 11, 2017
  11. Jan 25, 2017
    • [CodeGen] [CUDA] Add the ability to set default attrs on functions in linked modules. · 771d6cd1
      Justin Lebar authored
      Summary:
      Now when you ask clang to link in a bitcode module, you can tell it to
      set attributes on that module's functions to match what we would have
      set if we'd emitted those functions ourselves.
      
      This is particularly important for fast-math attributes in CUDA
      compilations.
      
      Each CUDA compilation links in libdevice, a bitcode library provided by
      NVIDIA as part of the CUDA distribution.  Without this patch, if a user
      function F compiled with -ffast-math calls a function G from libdevice,
      F will have the unsafe-fp-math=true (etc.) attributes, but G will have
      no attributes.
      
      Since F calls G, the inliner will merge G's attributes into F's.  It
      considers the lack of an unsafe-fp-math=true attribute on G to be
      tantamount to unsafe-fp-math=false, so it "merges" these by setting
      unsafe-fp-math=false on F.
      
      This then continues up the call graph, until every function that
      (transitively) calls something in libdevice gets unsafe-fp-math=false
      set, thus disabling fastmath in almost all CUDA code.
      
      Reviewers: echristo
      
      Subscribers: hfinkel, llvm-commits, mehdi_amini
      
      Differential Revision: https://reviews.llvm.org/D28538
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@293097 91177308-0d34-0410-b5e6-96231b3b80d8
      771d6cd1
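The attribute-merging problem described in this commit can be illustrated with a toy model. This is only a sketch: `ToyFunction`, `ToyModule`, `mergedUnsafeFpMath`, and `setDefaultAttrs` are hypothetical names invented here, not LLVM or clang APIs, and the model assumes an acyclic call graph.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only (not LLVM's API): a function has an "unsafe-fp-math"
// flag and a list of callees. A missing attribute is modelled as false.
struct ToyFunction {
  bool hasUnsafeFpMath;
  std::vector<std::string> callees;
};

using ToyModule = std::map<std::string, ToyFunction>;

// Mimics inlining everything bottom-up: a caller keeps unsafe-fp-math
// only if every (transitive) callee has it too. Assumes no recursion.
bool mergedUnsafeFpMath(const ToyModule &m, const std::string &name) {
  assert(m.count(name) && "unknown function");
  const ToyFunction &f = m.at(name);
  bool result = f.hasUnsafeFpMath;
  for (const std::string &callee : f.callees)
    result = result && mergedUnsafeFpMath(m, callee);
  return result;
}

// What the patch enables, in this toy model: stamp the default
// attributes onto every function that came from the linked module.
void setDefaultAttrs(ToyModule &m, const std::vector<std::string> &linked) {
  for (const std::string &name : linked)
    m.at(name).hasUnsafeFpMath = true;
}
```

In this model, a fast-math function F that calls an attribute-less G loses fast math after merging, and so does every caller of F. Calling `setDefaultAttrs` on G first keeps the whole chain fast-math.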
  12. Jan 18, 2017
  13. Dec 28, 2016
    • [CodeGen] Unique constant CompoundLiterals. · 75424358
      George Burgess IV authored
      Our newly aggressive constant folding logic makes it possible for
      CGExprConstant to see the same CompoundLiteralExpr more than once. So,
      emitting a new GlobalVariable every time we see a CompoundLiteral is no
      longer correct.
      
      We had a similar issue with BlockExprs that was caught while testing
      said aggressive folding, so I applied the same style of fix (see D26410)
      here. If we find yet another case where this needs to happen, we should
      probably refactor this so we don't have a third DenseMap+getter+setter.
      
      As a design note: getAddrOfConstantCompoundLiteralIfEmitted is really
      only intended to be called by ConstExprEmitter::EmitLValue. So,
      returning a GlobalVariable* instead of a ConstantAddress costs us
      effectively nothing, and saves us either a few bytes per entry in our
      map or a bit of code duplication.
      
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@290661 91177308-0d34-0410-b5e6-96231b3b80d8
      75424358
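The "map + getter + setter" style of uniquing mentioned in this commit might look roughly like the sketch below. `ToyExpr`, `ToyGlobal`, and `ToyEmitter` are hypothetical stand-ins for illustration, and `std::unordered_map` is used here in place of LLVM's DenseMap so that the example is self-contained.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for a compound literal expression and the
// global variable emitted for it.
struct ToyExpr { std::string text; };
struct ToyGlobal { std::string name; };

class ToyEmitter {
public:
  // Getter: returns the global already emitted for `e`, or nullptr.
  ToyGlobal *getIfEmitted(const ToyExpr *e) const {
    auto it = emitted_.find(e);
    return it == emitted_.end() ? nullptr : it->second.get();
  }

  // Setter: records the global emitted for `e`.
  void setEmitted(const ToyExpr *e, std::unique_ptr<ToyGlobal> g) {
    emitted_[e] = std::move(g);
  }

  // Emits a global for `e` at most once, however often `e` is visited.
  ToyGlobal *emit(const ToyExpr *e) {
    if (ToyGlobal *existing = getIfEmitted(e))
      return existing;
    std::unique_ptr<ToyGlobal> fresh(
        new ToyGlobal{"toy.literal." + std::to_string(emitted_.size())});
    ToyGlobal *raw = fresh.get();
    setEmitted(e, std::move(fresh));
    return raw;
  }

  std::size_t numGlobals() const { return emitted_.size(); }

private:
  std::unordered_map<const ToyExpr *, std::unique_ptr<ToyGlobal>> emitted_;
};
```

Keying the map on the expression's address means that seeing the same expression node twice yields the same global, while two distinct nodes still get distinct globals.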
  14. Dec 22, 2016
    • Add the alloc_size attribute to clang, attempt 2. · aa365cb2
      George Burgess IV authored
      This is a recommit of r290149, which was reverted in r290169 due to msan
      failures. msan was failing because we were calling
      `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us
      to read uninitialized memory. To fix this, the logic of the caller of
      said function was simplified, and we now have a `!Invalid` assert in
      `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more
      easily in the future.
      
      Fingers crossed that this patch sticks this time. :)
      
      Original commit message:
      
      This patch does three things:
      - Gives us the alloc_size attribute in clang, which lets us infer the
        number of bytes handed back to us by malloc/realloc/calloc/any user
        functions that act in a similar manner.
      - Teaches our constexpr evaluator that evaluating some `const` variables
        is OK sometimes. This is why we have a change in
        test/SemaCXX/constant-expression-cxx11.cpp and other seemingly
        unrelated tests. Richard Smith okay'ed this idea some time ago in
        person.
      - Uniques some Blocks in CodeGen, which was reviewed separately at
        D26410. Lack of uniquing only really shows up as a problem when
        combined with our new eagerness in the face of const.
      
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@290297 91177308-0d34-0410-b5e6-96231b3b80d8
      aa365cb2
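As a rough illustration of what the alloc_size attribute offers, the sketch below marks a hypothetical allocator `my_alloc` (not part of the patch) and asks the compiler how large the returned block is. It assumes a GCC- or clang-compatible compiler. Whether the size is actually known depends on the compiler and optimization level, so the helper may return `(size_t)-1` instead of the byte count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical user allocator: parameter 1 is the number of bytes
// returned, which is what alloc_size(1) tells the compiler.
__attribute__((alloc_size(1))) void *my_alloc(std::size_t n);

void *my_alloc(std::size_t n) { return std::malloc(n); }

// Returns the size the compiler infers for a 100-byte allocation, or
// (size_t)-1 if the compiler cannot determine it.
std::size_t knownSizeOf100ByteBlock() {
  void *const p = my_alloc(100);
  assert(p && "allocation failed");
  std::size_t n = __builtin_object_size(p, 0);
  std::free(p);
  return n;
}
```

Note that `p` is a `const` local here, which is the kind of variable the commit message says the constant evaluator is now willing to look through.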
  15. Dec 20, 2016
    • Revert r290149: Add the alloc_size attribute to clang. · 4e57f52f
      Chandler Carruth authored
      This commit fails MSan when running test/CodeGen/object-size.c in
      a confusing way. After some discussion with George, it isn't really
      clear what is going on here. We can make the MSan failure go away by
      testing for the invalid bit, but *why* things are invalid isn't clear.
      And yet, other code in the surrounding area is doing precisely this and
      testing for invalid.
      
      George is going to take a closer look at this to better understand the
      nature of the failure and recommit it. For now, backing it out to clean
      up MSan builds.
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@290169 91177308-0d34-0410-b5e6-96231b3b80d8
      4e57f52f
    • Add the alloc_size attribute to clang. · 598b6770
      George Burgess IV authored
      This patch does three things:
      
      - Gives us the alloc_size attribute in clang, which lets us infer the
        number of bytes handed back to us by malloc/realloc/calloc/any user
        functions that act in a similar manner.
      - Teaches our constexpr evaluator that evaluating some `const` variables
        is OK sometimes. This is why we have a change in
        test/SemaCXX/constant-expression-cxx11.cpp and other seemingly
        unrelated tests. Richard Smith okay'ed this idea some time ago in
        person.
      - Uniques some Blocks in CodeGen, which was reviewed separately at
        D26410. Lack of uniquing only really shows up as a problem when
        combined with our new eagerness in the face of const.
      
      Differential Revision: https://reviews.llvm.org/D14274
      
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@290149 91177308-0d34-0410-b5e6-96231b3b80d8
      598b6770
  16. Dec 15, 2016
    • Re-commit r289252 and r289285, and fix PR31374 · fbed33de
      Yaxun Liu authored
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@289787 91177308-0d34-0410-b5e6-96231b3b80d8
      fbed33de
    • CodeGen: fix runtime function dll storage · e13e0e40
      Saleem Abdulrasool authored
      Properly attribute DLL storage to runtime functions.  When generating the
      runtime function, scan for an existing declaration which may provide an explicit
      declaration (local storage) or a DLL import or export storage from the user.
      Honour that if available.  Otherwise, if building with a local visibility of the
      public or standard namespaces (-flto-visibility-public-std), give the symbols
      local storage (it indicates a /MT[d] link, so static runtime).  Otherwise,
      assume that the link is dynamic, and give the runtime function dllimport
      storage.
      
      This allows for implementations to get the correct storage as long as they are
      properly declared, the user to override the import storage, and in case no
      explicit storage is given, use of the import storage.
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@289776 91177308-0d34-0410-b5e6-96231b3b80d8
      e13e0e40
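The storage-selection order described in this commit can be sketched as a small decision function. The enum and function names below are hypothetical and only model the three cases from the message; they are not clang's actual code.

```cpp
// Toy model of the DLL storage classes involved.
enum class ToyDLLStorage { Default, Import, Export };

// `declared` points at the storage of an existing user declaration, or is
// nullptr if none was found. `ltoVisibilityPublicStd` models the
// -flto-visibility-public-std flag.
ToyDLLStorage runtimeFunctionStorage(const ToyDLLStorage *declared,
                                     bool ltoVisibilityPublicStd) {
  // 1. An existing declaration wins, whatever storage it carries.
  if (declared)
    return *declared;
  // 2. Static runtime (/MT[d]-style link): keep the symbol local.
  if (ltoVisibilityPublicStd)
    return ToyDLLStorage::Default;
  // 3. Otherwise assume a dynamic link and import the symbol.
  return ToyDLLStorage::Import;
}
```

For example, a runtime function that the user declared with export storage stays exported, while an undeclared one falls back to import storage unless the flag is set.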
  17. Dec 14, 2016
  18. Dec 09, 2016
    • Add support for non-zero null pointer for C and OpenCL · b6215483
      Yaxun Liu authored
      In the amdgcn target, null pointers in the global, constant, and generic address spaces take value 0, but null pointers in the private and local address spaces take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To work around this issue, instead of emitting null pointers in the local and private address spaces, a null pointer in the generic address space is emitted and cast to the local or private address space.
      
      Tentative definitions of global variables with a non-zero initializer will have weak linkage instead of common linkage, since common linkage requires a zero initializer and does not have an explicit section to hold the non-zero value.
      
      Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo; by default they return ConstantPointerNull and emit an addrspacecast instruction, respectively. A virtual member function getNullPointerValue is added to TargetInfo, which by default returns 0. Each target can override these virtual functions to get the target-specific null pointer and the null pointer value for a specific address space, and to perform specific translations for addrspacecast.
      
      A wrapper function getNullPointer is added to CodeGenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target-specific null pointers and their values.
      
      This change has no effect on targets other than amdgcn. Other targets can provide support for non-zero null pointers in a similar way.
      
      This change only provides support for non-zero null pointers for C and OpenCL. Support for other languages will be added incrementally later.
      
      Differential Revision: https://reviews.llvm.org/D26196
      
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@289252 91177308-0d34-0410-b5e6-96231b3b80d8
      b6215483
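The per-target override described in this commit could be modelled as below. The method name getNullPointerValue comes from the commit message, but the signature, the `ToyAddrSpace` enum, and both classes are assumptions made for this sketch and do not reproduce clang's real TargetInfo.

```cpp
#include <cstdint>

// Hypothetical address-space enum for illustration.
enum class ToyAddrSpace { Global, Constant, Generic, Private, Local };

// Default behaviour: null is 0 in every address space.
struct ToyTargetInfo {
  virtual ~ToyTargetInfo() = default;
  virtual std::int64_t getNullPointerValue(ToyAddrSpace) const { return 0; }
};

// amdgcn-like override: private and local null pointers are -1.
struct ToyAMDGCNTargetInfo : ToyTargetInfo {
  std::int64_t getNullPointerValue(ToyAddrSpace as) const override {
    switch (as) {
    case ToyAddrSpace::Private:
    case ToyAddrSpace::Local:
      return -1;
    default:
      return 0;
    }
  }
};
```

Code that needs a null pointer asks the target through the base-class interface, so targets that do not override the method are unaffected.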
  19. Dec 01, 2016
  20. Nov 03, 2016
  21. Oct 27, 2016
  22. Oct 14, 2016
  23. Oct 13, 2016
    • [CUDA] Emit deferred diagnostics during Sema rather than during codegen. · 45b902e6
      Justin Lebar authored
      Summary:
      Emitting deferred diagnostics during codegen was a hack.  It did work,
      but usability was poor, both for us as compiler devs and for users.  We
      don't codegen if there are any sema errors, so for users this meant that
      they wouldn't see deferred errors if there were any non-deferred errors.
      For devs, this meant that we had to carefully split up our tests so that
      when we tested deferred errors, we didn't emit any non-deferred errors.
      
      This change moves checking for deferred errors into Sema.  See the big
      comment in SemaCUDA.cpp for an overview of the idea.
      
      This checking adds overhead to compilation, because we have to maintain
      a partial call graph.  As a result, this change makes deferred errors a
      CUDA-only concept (whereas before they were a general concept).  If
      anyone else wants to use this framework for something other than CUDA,
      we can generalize at that time.
      
      This patch makes the minimal set of test changes -- after this lands,
      I'll go back through and do a cleanup of the tests that we no longer
      have to split up.
      
      Reviewers: rnk
      
      Subscribers: cfe-commits, rsmith, tra
      
      Differential Revision: https://reviews.llvm.org/D25541
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284158 91177308-0d34-0410-b5e6-96231b3b80d8
      45b902e6
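One way to picture deferred diagnostics backed by a partial call graph is the toy class below. It is a simplified sketch under the assumption that a diagnostic attached to a function is reported only once that function is known to be emitted. `ToyDeferredDiags` and its methods are hypothetical and much simpler than the logic described in SemaCUDA.cpp.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

class ToyDeferredDiags {
public:
  // Attach a diagnostic to `fn`; report at once if `fn` is already
  // known to be emitted, otherwise hold it back.
  void defer(const std::string &fn, const std::string &msg) {
    if (knownEmitted_.count(fn))
      reported_.push_back(msg);
    else
      pending_[fn].push_back(msg);
  }

  // Record a call edge. Edges are only stored while the caller's
  // status is still unknown, which keeps the graph partial.
  void recordCall(const std::string &caller, const std::string &callee) {
    if (knownEmitted_.count(caller))
      markEmitted(callee);
    else
      callees_[caller].insert(callee);
  }

  // Mark `fn` as emitted, flush its diagnostics, and propagate to
  // everything it is known to call.
  void markEmitted(const std::string &fn) {
    if (!knownEmitted_.insert(fn).second)
      return; // already handled; also breaks cycles
    auto p = pending_.find(fn);
    if (p != pending_.end()) {
      reported_.insert(reported_.end(), p->second.begin(), p->second.end());
      pending_.erase(p);
    }
    auto c = callees_.find(fn);
    if (c != callees_.end()) {
      std::set<std::string> next = std::move(c->second);
      callees_.erase(c);
      for (const std::string &callee : next)
        markEmitted(callee);
    }
  }

  const std::vector<std::string> &reported() const { return reported_; }

private:
  std::map<std::string, std::vector<std::string>> pending_;
  std::map<std::string, std::set<std::string>> callees_;
  std::set<std::string> knownEmitted_;
  std::vector<std::string> reported_;
};
```

Because reporting happens as soon as a function becomes known-emitted, it does not depend on reaching codegen, which is the usability point the commit message makes.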
  24. Sep 14, 2016
  25. Sep 09, 2016
  26. Aug 15, 2016
  27. Jul 28, 2016
    • [OpenCL] Generate opaque type for sampler_t and function call for the initializer · 427517d1
      Yaxun Liu authored
      Currently Clang uses int32 to represent sampler_t, which has been a source of issues for some backends, because in those backends sampler_t cannot be represented by int32. They have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to a target-specific sampler type.
      
      This patch uses the opaque pointer type opencl.sampler_t* for sampler_t. For each use of a file-scope sampler variable, and for each initialization of a function-scope sampler variable, it generates a call to __translate_sampler_initializer.
      
      Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and easily optimized to finally become some literal values directly put into image read instructions.
      
      This patch is partially based on Alexey Sotkin's work in Khronos Clang (https://github.com/KhronosGroup/SPIR/commit/3d4eec61623502fc306e8c67c9868be2b136e42b).
      
      Differential Revision: https://reviews.llvm.org/D21567
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@277024 91177308-0d34-0410-b5e6-96231b3b80d8
      427517d1
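The table-lookup implementation mentioned in this commit might be sketched as follows. Everything here is hypothetical: `ToySampler`, the table contents, and `toyTranslateSamplerInitializer` are placeholders invented for illustration and do not correspond to any real builtin library or hardware encoding.

```cpp
#include <cstddef>

// Hypothetical target-specific sampler descriptor.
struct ToySampler {
  unsigned hwWord;
};

// Placeholder descriptors, one per supported initializer value.
static const ToySampler kToySamplerTable[] = {{0x10}, {0x11}, {0x12}, {0x13}};

// Maps an integer sampler initializer to a pointer to the "real"
// sampler; returns nullptr for values this toy table does not cover.
const ToySampler *toyTranslateSamplerInitializer(unsigned init) {
  constexpr std::size_t count =
      sizeof(kToySamplerTable) / sizeof(kToySamplerTable[0]);
  if (init >= count)
    return nullptr;
  return &kToySamplerTable[init];
}
```

Since the argument is always a literal at the call site, a lookup of this shape is something an optimizer can fold to a constant pointer.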
  28. Jun 24, 2016
  29. Apr 28, 2016
  30. Apr 27, 2016
    • Rework interface for bitset-using features to use a notion of LTO visibility. · 47213cf9
      Peter Collingbourne authored
      Bitsets, and the compiler features they rely on (vtable opt, CFI),
      only have visibility within the LTO'd part of the linkage unit. Therefore,
      only enable these features for classes with hidden LTO visibility. This
      notion is based on object file visibility or (on Windows)
      dllimport/dllexport attributes.
      
      We provide the [[clang::lto_visibility_public]] attribute to override the
      compiler's LTO visibility inference in cases where the class is defined
      in the non-LTO'd part of the linkage unit, or where the ABI supports
      calling classes derived from abstract base classes with hidden visibility
      in other linkage units (e.g. COM on Windows).
      
      If the cross-DSO CFI mode is enabled, bitset checks are emitted even for
      classes with public LTO visibility, as that mode uses a separate mechanism
      to cause bitsets to be exported.
      
      This mechanism replaces the whole-program-vtables blacklist, so remove the
      -fwhole-program-vtables-blacklist flag.
      
      Because __declspec(uuid()) now implies [[clang::lto_visibility_public]], the
      support for the special attr:uuid blacklist entry is removed.
      
      Differential Revision: http://reviews.llvm.org/D18635
      
      git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@267784 91177308-0d34-0410-b5e6-96231b3b80d8
      47213cf9
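A minimal usage sketch of the [[clang::lto_visibility_public]] attribute named in this commit is shown below. The `TOY_LTO_PUBLIC` macro and the class names are assumptions made here so that the example also builds with compilers that do not know the attribute. The classes themselves are purely illustrative.

```cpp
// Use the attribute only where the compiler reports support for it.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::lto_visibility_public)
#define TOY_LTO_PUBLIC [[clang::lto_visibility_public]]
#endif
#endif
#ifndef TOY_LTO_PUBLIC
#define TOY_LTO_PUBLIC
#endif

// An abstract interface that may be implemented outside the LTO'd part
// of the linkage unit, so it is marked as having public LTO visibility.
struct TOY_LTO_PUBLIC ToyInterface {
  virtual ~ToyInterface() = default;
  virtual int answer() const = 0;
};

struct ToyImpl final : ToyInterface {
  int answer() const override { return 42; }
};

int callThroughInterface(const ToyInterface &i) { return i.answer(); }
```

The attribute does not change what the program computes. It only tells the compiler not to assume it can see every class derived from `ToyInterface` during LTO.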
  31. Apr 14, 2016
  32. Apr 11, 2016