sync with upstream #2

Open
jcphill wants to merge 38 commits into ncsa:main from r-ccs-cms:main

jcphill commented May 4, 2026

No description provided.

walkup and others added 30 commits January 23, 2026 11:38
… mult.h, save memory in GenerateExcitation() helper.h
…ic/out_of_place_func/davidson.h and framework/mpi_utility.h
Implements Hamiltonian-vector multiplication on GPU using OpenMP target offload
with determinant precomputation cache. Includes comprehensive test suite.

Key features:
- OpenMP target offload for GPU acceleration
- Determinant cache to eliminate redundant computation
- Batched Hij computation on device
- MPI parallelization support
- Vendor-neutral build system (ROCM_PATH)
- Complete functionality test suite

Co-Authored-By: Claude <noreply@anthropic.com>
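For readers unfamiliar with the offload pattern this commit message describes, here is a minimal sketch of a matrix-vector product using OpenMP target offload. It is not the code from this PR: the dense row-major matrix is a stand-in for the Hamiltonian (the commit computes Hij in batches on the device), and the name `matvec_offload` is hypothetical. If no device is available, or the code is compiled without OpenMP, the loop simply runs on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense stand-in for the Hamiltonian: computes y = H * x,
// with H stored row-major as an n*n array.
std::vector<double> matvec_offload(const std::vector<double>& H,
                                   const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(H.size() == n * n);  // H must be square and match x

    std::vector<double> y(n, 0.0);

    // Raw pointers are used because std::vector itself cannot be mapped
    // to the device; only the underlying contiguous storage can.
    const double* H_ptr = H.data();
    const double* x_ptr = x.data();
    double* y_ptr = y.data();

    // Copy H and x to the device, run one row per loop iteration,
    // and copy y back when the region ends.
    #pragma omp target teams distribute parallel for \
        map(to: H_ptr[0:n * n], x_ptr[0:n]) map(from: y_ptr[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            sum += H_ptr[i * n + j] * x_ptr[j];
        }
        y_ptr[i] = sum;
    }
    return y;
}
```

Each output row is independent, so no reduction or atomic update is needed across loop iterations in this simplified form.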
Prevents undefined behavior if FCIDUMP header is corrupt and NORB field
is missing. L=0 is safe as it results in empty determinant vectors.
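The defensive default described in this commit message can be illustrated with a small sketch. It assumes a simplified FCIDUMP header of the form `&FCI NORB=4,NELEC=4,...`; the function name `parse_norb_or_zero` is hypothetical and this is not the parser used in the repository.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the NORB value from a header string,
// or 0 when the field is missing or has no digits after it.
int parse_norb_or_zero(const std::string& header) {
    int L = 0;  // initialized up front so a corrupt header never leaves it undefined

    const std::string key = "NORB";
    std::size_t pos = header.find(key);
    if (pos == std::string::npos) {
        return L;  // field missing: fall back to the safe default
    }
    pos += key.size();

    // Skip the '=' sign and any spaces around it.
    while (pos < header.size() && (header[pos] == ' ' || header[pos] == '=')) {
        ++pos;
    }

    // Accumulate decimal digits; stop at the first non-digit (e.g. ',').
    bool found_digit = false;
    int value = 0;
    while (pos < header.size() &&
           std::isdigit(static_cast<unsigned char>(header[pos]))) {
        value = value * 10 + (header[pos] - '0');
        found_digit = true;
        ++pos;
    }
    return found_digit ? value : L;
}
```

With a value of 0, any container sized from the orbital count is simply empty, which matches the reasoning in the commit message.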
This commit refines the changes from PR #1 (commits 8eada1a and adceb9d)
to improve build system flexibility and fix several bugs:

Makefile improvements:
- Add CXXFLAGS/LDFLAGS command-line override support while preserving
  required flags (-fopenmp, -std=c++17, -funroll-loops, etc.)
- User can now override flags: make GPU=1 CXXFLAGS="-I/custom/path"
- Required flags are always prepended to user-provided flags
- Fix GPU variable pass-through from parent Makefile to src/Makefile
- Add GPU ?= 1 default in parent Makefile

Bug fixes:
- Remove 'clean' from target prerequisites (was incorrectly linking it)
- Use $(TARGET) and $(OBJECTS) in clean rule instead of hardcoded names
- Move final CXXFLAGS/LDFLAGS override after all conditional modifications
  so GPU=1/GPU=0 actually take effect
- Add CXXFLAGS and LDFLAGS to verbose build output

Code cleanup:
- Remove "from Phase 1" development comment in hij_omp_offload.h
- Add clang-format guards around ASCII art in helper.h
- Add clarifying comment about GPU device selection in main.cc

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add missing #include <vector> in mpi_utility.h
- Fix helper.h: typo in SinglesFromAlphaOffset[i], correct CrAn flat
  array sizing (2x for singles, 4x for doubles) and flattening loops
- Fix mult.h: correct for-loop init, use Wb_ptr/T_ptr raw pointers
  instead of std::vector in target regions, add missing map clauses
  for CrAn arrays and det_cache_ptr, fix TwoExcite_device call
  signature (7 args), fix #ifdef typo, use int for CrAn variables
- Fix sbdiag.h: add missing #endif for SBD_THRUST block, unify
  USE_HIJ_OMP_OFFLOAD -> USE_OMP_OFFLOAD for integral GPU mapping
- Fix main.cc: match 13-arg diag() signature with co_adet/co_bdet
- Remove obsolete hij_omp_offload.h (replaced by omp_offload.h)
- Update test and Makefile flags to use USE_OMP_OFFLOAD consistently

Co-Authored-By: Claude <noreply@anthropic.com>
Made-with: Cursor
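Several of the fixes in this commit message concern flattening nested containers into `int` arrays and passing raw pointers into target regions. The sketch below shows that general pattern under a loudly stated assumption: a single excitation is taken to occupy 2 integers and a double excitation 4 integers, which is one reading of "2x for singles, 4x for doubles". The names `flatten_cran` and `sum_on_device` are hypothetical and do not come from helper.h or mult.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattening step: each entry of `cran` holds `width` indices
// (assumed 2 for a single excitation, 4 for a double excitation).
std::vector<int> flatten_cran(const std::vector<std::vector<int>>& cran,
                              std::size_t width) {
    std::vector<int> flat(width * cran.size());
    for (std::size_t i = 0; i < cran.size(); ++i) {
        assert(cran[i].size() == width);  // every entry must have the same width
        for (std::size_t k = 0; k < width; ++k) {
            flat[i * width + k] = cran[i][k];
        }
    }
    return flat;
}

// Reads the flat array inside a target region through a raw pointer with an
// explicit map clause, since a std::vector cannot be mapped directly.
int sum_on_device(const std::vector<int>& flat) {
    const int* p = flat.data();
    const std::size_t n = flat.size();
    int total = 0;

    #pragma omp target teams distribute parallel for \
        map(to: p[0:n]) map(tofrom: total) reduction(+: total)
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}
```

Sizing the flat array as `width * count` up front is what keeps the index arithmetic `i * width + k` in bounds; an undersized array is the kind of error the sizing fix in the commit message appears to address.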
suggested code changes including elimination of the determinant cache
added GPU offload for reduced density matrix computation

5 participants