qa: do not create rbd pool for CephFS QA#53037

Merged

vshankar merged 2 commits into ceph:main from batrick:i62482 on Sep 4, 2023
Conversation

@batrick (Member) commented Aug 17, 2023

Fixes: https://tracker.ceph.com/issues/62482

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@batrick (Member Author) commented Aug 17, 2023

jenkins test make check

@batrick (Member Author) commented Aug 17, 2023

jenkins test make check arm64

@batrick (Member Author) commented Aug 18, 2023

jenkins test make check

@batrick (Member Author) commented Aug 18, 2023

jenkins test make check arm64

@batrick batrick requested a review from a team August 23, 2023 18:05
@dparmar18 (Contributor)

We had a similar PR that ignored this warning (#53077), but I think NOT creating the rbd pool is a much better solution.

@mchangir (Contributor) left a comment


This is good.
Although the config should have required an explicit request for the types of pools to be created, rather than creating them implicitly and then requiring their creation to be explicitly disabled. This is so twisted.

@vshankar (Contributor)

@batrick We could probably revert changes from #53077? Could you include that in this change?

This reverts commit b8bf0c6, reversing
changes made to fe07f64.

Silencing this health warning is unnecessary if we stop creating the rbd pool
in CephFS testing.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: https://tracker.ceph.com/issues/62482
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
@batrick (Member Author) commented Aug 28, 2023

@vshankar done

@batrick (Member Author) commented Aug 29, 2023

jenkins test windows

batrick added a commit to batrick/ceph that referenced this pull request Aug 30, 2023
* refs/pull/53037/head:
	qa: do not create rbd pool for CephFS QA
	Revert "Merge PR ceph#53077 into main"
@vshankar (Contributor) commented Sep 4, 2023

Tested with my mirroring changes: https://pulpito.ceph.com/?branch=wip-62072-3

@vshankar vshankar merged commit 27edb75 into ceph:main Sep 4, 2023
```yaml
- \(MDS_ALL_DOWN\)
- \(MDS_UP_LESS_THAN_MAX\)
- \(FS_INLINE_DATA_DEPRECATED\)
- \(POOL_APP_NOT_ENABLED\)
```
Contributor


While I obviously agree with not creating an RBD pool for CephFS tests, I'm surprised to learn that @vshankar's run passed with POOL_APP_NOT_ENABLED dropped from the ignorelist. The issue is not (specific to) the RBD pool, rather it's the raising of a bogus health alert. See the recent thread on the dev list:

https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/ZTDYC5HN677RR26EB4P6PORN6L2IFH4R/

The RBD pool in this case is just the messenger -- it's tagged with the RBD application immediately:

ceph/qa/tasks/ceph.py

Lines 429 to 440 in 636d2a4

```python
if config.get('create_rbd_pool', True):
    log.info('Creating RBD pool')
    mon_remote.run(
        args=['sudo', 'ceph', '--cluster', cluster_name,
              'osd', 'pool', 'create', 'rbd', '8'])
    mon_remote.run(
        args=[
            'sudo', 'ceph', '--cluster', cluster_name,
            'osd', 'pool', 'application', 'enable',
            'rbd', 'rbd', '--yes-i-really-mean-it'
        ],
        check_status=False)
```
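Given that `config.get('create_rbd_pool', True)` check, a CephFS suite can opt out of the pool entirely by overriding the flag in the task config. A minimal teuthology fragment might look like the following (the exact placement under the `ceph` task is my assumption, not taken from this PR's diff):

```yaml
tasks:
- ceph:
    # Skip the implicit 'ceph osd pool create rbd 8' during cluster setup,
    # so no RBD pool exists to trip POOL_APP_NOT_ENABLED in CephFS runs.
    create_rbd_pool: false
```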

In the example job linked by @batrick in https://tracker.ceph.com/issues/62482, POOL_APP_NOT_ENABLED is raised for CephFS pools a few seconds after getting raised on the RBD pool:

```
2023-08-16T11:24:28.204+0000 7fc702096640 20 mgr.server operator() health checks:
{
    "POOL_APP_NOT_ENABLED": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 pool(s) do not have an application enabled",
            "count": 1
        },
        "detail": [
            {
                "message": "application not enabled on pool 'rbd'"
            },
            {
                "message": "use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications."
            }
        ]
    }
}
2023-08-16T11:24:34.204+0000 7fc702096640 20 mgr.server operator() health checks:
{
    "POOL_APP_NOT_ENABLED": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "2 pool(s) do not have an application enabled",
            "count": 2
        },
        "detail": [
            {
                "message": "application not enabled on pool 'cephfs_metadata'"
            },
            {
                "message": "application not enabled on pool 'cephfs_data'"
            },
            {
                "message": "use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications."
            }
        ]
    }
}
```
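For reference, the offending pool names can be pulled straight out of a health-check payload like the one above. This is a small self-contained sketch; the helper name and the regex are mine, not part of the PR or the mgr code:

```python
import json
import re

def pools_missing_application(health_checks: str) -> list:
    """Return pool names flagged by POOL_APP_NOT_ENABLED in a mgr health dump."""
    checks = json.loads(health_checks)
    warn = checks.get("POOL_APP_NOT_ENABLED", {})
    pools = []
    for item in warn.get("detail", []):
        # Detail messages look like: "application not enabled on pool 'rbd'";
        # the trailing "use 'ceph osd pool application enable ...'" hint is skipped.
        m = re.match(r"application not enabled on pool '([^']+)'", item["message"])
        if m:
            pools.append(m.group(1))
    return pools

dump = """
{
    "POOL_APP_NOT_ENABLED": {
        "severity": "HEALTH_WARN",
        "summary": {"message": "2 pool(s) do not have an application enabled", "count": 2},
        "detail": [
            {"message": "application not enabled on pool 'cephfs_metadata'"},
            {"message": "application not enabled on pool 'cephfs_data'"},
            {"message": "use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications."}
        ]
    }
}
"""
print(pools_missing_application(dump))  # ['cephfs_metadata', 'cephfs_data']
```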

Member Author

I think there was an underlying RADOS change with a fix in-flight for that, which may have been merged by the time Venky did QA for this. Venky, do you have that link handy still?

Contributor

> I think there was an underlying RADOS change with a fix in-flight for that, which may have been merged by the time Venky did QA for this. Venky, do you have that link handy still?

Unfortunately, no. I had run a pretty minimal set of tests for this, and that's probably the reason no failures related to the warning showed up -- just got unlucky, I guess?

Contributor

> I had run a pretty minimal set of tests for this, and that's probably the reason no failures related to the warning showed up -- just got unlucky, I guess?

Yeah, this health alert became not only bogus but also racy...

