Commit 8ed2b52

fix: integration / Archery test With other arrows container ran out of space (#9043)
# Which issue does this PR close?

- Closes #9024.

# Rationale for this change

The CI container starts with 63 GB of its 72 GB disk already used. The remaining 9.5 GB is barely enough for a cross build in 7 languages, which leaves CI stuck. A debug step run after the container is initialized shows:

    === CONTAINER DISK USAGE ===
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   63G  9.5G  87% /

# What changes are included in this PR?

- Add resource monitoring to the build process.
- Add a cleanup step that removes unnecessary software (frees about 6 GB):

      === Cleaning up host disk space ===
      Disk space before cleanup:
      Filesystem      Size  Used Avail Use% Mounted on
      overlay          72G   63G  9.5G  87% /
      Disk space after cleanup:
      Filesystem      Size  Used Avail Use% Mounted on
      overlay          72G   57G   16G  79% /

- As a small optimization, shallow-clone the GitHub repos (fetch only the most recent commit, not the full history).

Result: 6.1 GB is left after the build.

    === After Build ===
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   66G  6.1G  92% /

# Are these changes tested?

Tested by GitHub CI.

# Are there any user-facing changes?

No.

---------

Signed-off-by: lyang24 <lanqingy93@gmail.com>
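The monitoring steps in this commit only print `df -h /`. As a possible follow-up, a step could also check the free space against a threshold so that a nearly full disk is flagged before the build starts. The sketch below is not part of this PR: the helper names `avail_kb` and `has_enough_space`, the `MIN_FREE_GB` variable, and the 5 GiB default are all hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the PR.

# Print the KiB available on the filesystem that holds $1 (default: /).
# -P selects the POSIX one-line-per-filesystem format, -k reports 1024-byte units.
avail_kb() {
  df -Pk "${1:-/}" | awk 'NR==2 {print $4}'
}

# Succeed if $1 (available KiB) is at least $2 GiB.
has_enough_space() {
  local kb=$1 min_gb=$2
  [ "$kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

# 5 is an arbitrary illustrative threshold, not a value taken from the PR.
MIN_FREE_GB=${MIN_FREE_GB:-5}
if has_enough_space "$(avail_kb /)" "$MIN_FREE_GB"; then
  echo "disk check: ok"
else
  echo "disk check: less than ${MIN_FREE_GB} GiB free" >&2
fi
```

In a real workflow step the `else` branch would probably end with `exit 1` to fail the job early. It is left out here so the snippet can be sourced without terminating the shell.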
1 parent b1ddc24 commit 8ed2b52

1 file changed

Lines changed: 60 additions & 6 deletions

.github/workflows/integration.yml

@@ -78,58 +78,112 @@ jobs:
       run:
         shell: bash
     steps:
+      - name: Monitor disk usage - Initial
+        run: |
+          echo "=== Initial Disk Usage ==="
+          df -h /
+          echo ""
+
+      - name: Remove unnecessary preinstalled software
+        run: |
+          echo "=== Cleaning up host disk space ==="
+          echo "Disk space before cleanup:"
+          df -h /
+
+          # Clean apt cache
+          apt-get clean || true
+
+          # Remove GitHub Actions tool cache
+          rm -rf /__t/* || true
+
+          # Remove large packages from host filesystem (mounted at /host/)
+          rm -rf /host/usr/share/dotnet || true
+          rm -rf /host/usr/local/lib/android || true
+          rm -rf /host/usr/local/.ghcup || true
+          rm -rf /host/opt/hostedtoolcache/CodeQL || true
+
+          echo ""
+          echo "Disk space after cleanup:"
+          df -h /
+          echo ""
+
       # This is necessary so that actions/checkout can find git
       - name: Export conda path
         run: echo "/opt/conda/envs/arrow/bin" >> $GITHUB_PATH
       # This is necessary so that Rust can find cargo
       - name: Export cargo path
         run: echo "/root/.cargo/bin" >> $GITHUB_PATH
-      - name: Check rustup
-        run: which rustup
-      - name: Check cmake
-        run: which cmake
+
+      # Checkout repos (using shallow clones with fetch-depth: 1)
       - name: Checkout Arrow
         uses: actions/checkout@v6
         with:
           repository: apache/arrow
           submodules: true
-          fetch-depth: 0
+          fetch-depth: 1
       - name: Checkout Arrow Rust
         uses: actions/checkout@v6
         with:
           path: rust
           submodules: true
-          fetch-depth: 0
+          fetch-depth: 1
       - name: Checkout Arrow .NET
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-dotnet
           path: dotnet
+          fetch-depth: 1
       - name: Checkout Arrow Go
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-go
           path: go
+          fetch-depth: 1
       - name: Checkout Arrow Java
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-java
           path: java
+          fetch-depth: 1
       - name: Checkout Arrow JavaScript
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-js
           path: js
+          fetch-depth: 1
       - name: Checkout Arrow nanoarrow
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-nanoarrow
           path: nanoarrow
+          fetch-depth: 1
+
+      - name: Monitor disk usage - After checkouts
+        run: |
+          echo "=== After Checkouts ==="
+          df -h /
+          echo ""
+
       - name: Build
         run: conda run --no-capture-output ci/scripts/integration_arrow_build.sh $PWD /build
+
+      - name: Monitor disk usage - After build
+        if: always()
+        run: |
+          echo "=== After Build ==="
+          df -h /
+          echo ""
+
       - name: Run
         run: conda run --no-capture-output ci/scripts/integration_arrow.sh $PWD /build
 
+      - name: Monitor disk usage - After tests
+        if: always()
+        run: |
+          echo "=== After Tests ==="
+          df -h /
+          echo ""
+
   # test FFI against the C-Data interface exposed by pyarrow
   pyarrow-integration-test:
     name: Pyarrow C Data Interface
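
The `fetch-depth: 1` setting asks `actions/checkout` for a shallow clone that contains only the most recent commit, whereas `fetch-depth: 0` fetches the full history. The same effect can be reproduced with plain `git`, independent of GitHub Actions. The sketch below is illustrative only: it builds a throwaway local repository (the paths, commit messages, and `demo` identity are made up) and compares a `--depth 1` clone with a full clone.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory, removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a throwaway repository with three commits.
git init -q "$work/src"
for i in 1 2 3; do
  echo "$i" > "$work/src/file.txt"
  git -C "$work/src" add file.txt
  git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# The file:// URL forces the regular transport; with a bare local path,
# git ignores --depth and copies the whole repository.
git clone -q --depth 1 "file://$work/src" "$work/shallow"
git clone -q "file://$work/src" "$work/full"

shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
full_count=$(git -C "$work/full" rev-list --count HEAD)
echo "shallow clone commits: $shallow_count"
echo "full clone commits: $full_count"
```

The shallow clone reports 1 commit and the full clone reports 3. For a repository with a long history, the difference in `.git` size is where the disk savings come from. Note that `actions/checkout` documents `fetch-depth: 1` as its default, so the explicit setting changes behavior mainly for the two checkouts that previously used `fetch-depth: 0`.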