
Crashes in TTieringActualizer::Refresh when upgrading from ydb-stable-24-4-4-hotfix-2 to ydb-stable-25-1-1-3 #17184


Closed
dorooleg opened this issue Apr 14, 2025 · 13 comments · Fixed by #17729

@dorooleg
Collaborator

Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 0. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:83: NPrivate::InternalPanicImpl(int, char const*, char const*, int, int, int, TBasicStringBuf<char, std::__y1::char_traits<char>>, char const*, unsigned long) @ 0x55A220199DEB
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 1. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:55: NPrivate::Panic(NPrivate::TStaticBuf const&, int, char const*, char const*, char const*, ...) @ 0x55A220192E5B
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 2. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/log.cpp:754: NActors::TVerifyFormattedRecordWriter::~TVerifyFormattedRecordWriter() @ 0x55A220866F04
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 3. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/scheme/versions/abstract_scheme.cpp:172: NKikimr::NOlap::ISnapshotSchema::GetColumnId(std::__y1::basic_string<char, std::__y1::char_traits<char>, std::__y1::allocator<char>> const&) const @ 0x55A228B19756
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 4. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/actualizer/tiering/tiering.cpp:239: NKikimr::NOlap::NActualizer::TTieringActualizer::Refresh(std::__y1::optional<NKikimr::NOlap::TTiering> const&, NKikimr::NOlap::NActualizer::TAddExternalContext const&) @ 0x55A22A9463A0
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 5. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/actualizer/index/index.cpp:29: NKikimr::NOlap::NActualizer::TGranuleActualizationIndex::RefreshTiering(std::__y1::optional<NKikimr::NOlap::TTiering> const&, NKikimr::NOlap::NActualizer::TAddExternalContext const&) @ 0x55A22A93AE87
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 6. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/granule/granule.h:186: NKikimr::NOlap::TGranuleMeta::RefreshTiering(std::__y1::optional<NKikimr::NOlap::TTiering> const&) @ 0x55A22AB41C48
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 7. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/column_engine_logs.cpp:503: NKikimr::NOlap::TColumnEngineForLogs::OnTieringModified(THashMap<unsigned long, NKikimr::NOlap::TTiering, THash<unsigned long>, TEqualTo<unsigned long>, std::__y1::allocator<unsigned long>> const&) @ 0x55A22AB4499B
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 8. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard_impl.cpp:1622: NKikimr::NColumnShard::TColumnShard::OnTieringModified(std::__y1::optional<unsigned long>) @ 0x55A22ACFAA69
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 9. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard.cpp:83: NKikimr::NColumnShard::TColumnShard::TrySwitchToWork(NActors::TActorContext const&) @ 0x55A22ABF1D1D
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 10. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard__init.cpp:110: NKikimr::NColumnShard::TTxInit::Complete(NActors::TActorContext const&) @ 0x55A22AC963EE
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 11. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_exec_seat.cpp:11: NKikimr::NTabletFlatExecutor::TSeat::Complete(NActors::TActorContext const&, bool) @ 0x55A22352C0D8
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 12. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor_txloglogic.cpp:82: NKikimr::NTabletFlatExecutor::CompleteRoTransaction(TAutoPtr<NKikimr::NTabletFlatExecutor::TSeat, TDelete>, NActors::TActorContext const&, NKikimr::NTabletFlatExecutor::TExecutorCounters*, NKikimr::TTabletCountersWithTxTypes*) @ 0x55A223523EB8
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 13. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor_txloglogic.cpp:98: NKikimr::NTabletFlatExecutor::TLogicRedo::CommitROTransaction(TAutoPtr<NKikimr::NTabletFlatExecutor::TSeat, TDelete>, NActors::TActorContext const&) @ 0x55A22352459D
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 14. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:2090: NKikimr::NTabletFlatExecutor::TExecutor::CommitTransactionLog(TAutoPtr<NKikimr::NTabletFlatExecutor::TSeat, TDelete>, NKikimr::NTabletFlatExecutor::TPageCollectionTxEnv&, TAutoPtr<NKikimr::NTable::TChange, TDelete>, THPTimer&, NActors::TActorContext const&) @ 0x55A22347F8D6
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 15. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:1833: NKikimr::NTabletFlatExecutor::TExecutor::ExecuteTransaction(TAutoPtr<NKikimr::NTabletFlatExecutor::TSeat, TDelete>, NActors::TActorContext const&) @ 0x55A22347EB8B
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 16. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:2696: NKikimr::NTabletFlatExecutor::TExecutor::Handle(TAutoPtr<NActors::TEventHandle<NKikimr::NTabletFlatExecutor::TExecutor::TEvPrivate::TEvActivateExecution>, TDelete>&, NActors::TActorContext const&) @ 0x55A22348A2C9
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 17. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:3970: NKikimr::NTabletFlatExecutor::TExecutor::StateWork(TAutoPtr<NActors::IEventHandle, TDelete>&) @ 0x55A22346BB41
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 18. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:269: NActors::TExecutorThread::Execute(NActors::TMailbox*, bool) @ 0x55A2208576D6
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 19. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:460: NActors::TExecutorThread::ProcessExecutorPool()::$_0::operator()(NActors::TMailbox*, bool) const @ 0x55A22085B6B5
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 20. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:512: NActors::TExecutorThread::ProcessExecutorPool() @ 0x55A22085B290
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 21. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:538: NActors::TExecutorThread::ThreadProc() @ 0x55A22085BE8F
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 22. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/thread.cpp:244: (anonymous namespace)::TPosixThread::ThreadProxy(void*) @ 0x55A22019EE07
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 23. /build/glibc-FcRMwW/glibc-2.31/nptl/pthread_create.c:477: start_thread @ 0x7FEE74AB2608
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[7348]: 24. ../sysdeps/unix/sysv/linux/x86_64/clone.S:95: ?? @ 0x7FEE749D7352
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw systemd[1]: kikimr.service: Main process exited, code=killed, status=6/ABRT
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw ydb_notify_restart[7474]: ERROR: ld.so: object 'libbreakpad_init.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Apr 14 16:49:43 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw systemd[1]: kikimr.service: Failed with result 'signal'.
Apr 14 16:49:44 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw systemd[1]: kikimr.service: Scheduled restart job, restart counter is at 19.
Apr 14 17:04:51 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[13612]: VERIFY failed (2025-04-14T17:04:51.027496Z): tablet_id=72075186251006743;self_id=[50056:7493215939848279549:2660];process=SwitchToWork;verification=id;fline=abstract_scheme.cpp:172;column_name=_ts;schema=id,_yql_plan_step,_yql_tx_id,_yql_write_id,_yql_delete_flag;
Apr 14 17:10:41 vm-cc8mco0j0snqehgh7r2a-ru-central1-a-onkf-yjyw kikimr[15818]: VERIFY failed (2025-04-14T17:10:41.055686Z): self_id=[50056:7493217442183525268:2524];tablet_id=72075186245425751;parent=[50056:7493217442183523702:2280];verification=result;fline=index_info.cpp:655;id=2;indexes=1,4294967040,4294967041,4294967042,4294967043;
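
The first VERIFY message above reports that the column `_ts` was looked up in a schema that lists only `id` plus the system columns. A minimal sketch of that kind of lookup by name, assuming a hypothetical `FindColumnId` helper (this is not the actual ISnapshotSchema::GetColumnId code). ydbd aborts on the failed verification; the sketch returns an empty optional instead so the mismatch is easy to observe:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, not YDB code: returns the 1-based position of `name`
// in `schema`, or std::nullopt if the column is absent.
std::optional<std::uint32_t> FindColumnId(const std::vector<std::string>& schema,
                                          const std::string& name) {
    for (std::size_t i = 0; i < schema.size(); ++i) {
        if (schema[i] == name) {
            return static_cast<std::uint32_t>(i + 1);
        }
    }
    return std::nullopt;
}

// Column list copied from the VERIFY message in the log above.
const std::vector<std::string> kLoggedSchema = {
    "id", "_yql_plan_step", "_yql_tx_id", "_yql_write_id", "_yql_delete_flag"};
```

Calling `FindColumnId(kLoggedSchema, "_ts")` yields an empty optional, which corresponds to the situation the verification caught.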

@dorooleg dorooleg self-assigned this Apr 14, 2025
@dorooleg
Collaborator Author

The shared database is crashing: /pre-prod_ydb_public/aoedo0ji1lgce9l91har/cc8mco0j0snqehgh7r2a

@dorooleg
Collaborator Author

But the actual crash is in the sls database: /pre-prod_ydb_public/yc.yaem.service-cloud/cc8to96e3k5226d9hfdo/compliancenica/services_problems

@dorooleg
Collaborator Author

TTL in it is configured on the _ts column:


Enabled: {
    ColumnName: _ts,
    ExpireAfterSeconds: 604800
},
Version: 3
},
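
For scale, ExpireAfterSeconds: 604800 is exactly seven days. A small sketch with std::chrono that checks the conversion at compile time (the constant and function names are illustrative, not part of YDB):

```cpp
#include <chrono>
#include <ratio>

// Value copied from the TTL settings above.
constexpr std::chrono::seconds kExpireAfter{604800};

// Whole days contained in the TTL interval.
constexpr long long TtlDays() {
    using Days = std::chrono::duration<long long, std::ratio<86400>>;
    return std::chrono::duration_cast<Days>(kExpireAfter).count();
}

static_assert(TtlDays() == 7, "604800 seconds is 7 days");
```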


@dorooleg
Collaborator Author

Table schema:
[Image: table schema screenshot]

@dorooleg
Collaborator Author

I checked with a test tool built from 25-1-1-3 that VersionedIndex is built correctly from the data written to the local database on ydb-stable-24-4-4-hotfix-2.

@dorooleg
Collaborator Author

@aavdonkin
Could you try the following:

  1. Start ydbd on version ydb-stable-24-4-4-hotfix-2.
  2. Create a table there. Run several ALTERs and set a TTL to delete data, in arbitrary order, writing data into each schema version along the way.
  3. Start the new version, ydb-stable-25-1-1-3, on top of it and check that SELECT queries work.

@dorooleg dorooleg assigned aavdonkin and unassigned dorooleg Apr 17, 2025
@aavdonkin
Contributor

aavdonkin commented Apr 21, 2025

After the latest rollout of a new stable build, it started crashing in a different place:

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f4ecf08d859 in __GI_abort () at abort.c:79
#2 0x000055c8201a7df1 in NPrivate::InternalPanicImpl (line=line@entry=30, function=function@entry=0x55c80ea9505a "TSpecialKeys", expr=expr@entry=0x55c80e4f20e9 "Data", file=...,
errorMessage=0x55c8105eda29 <NULL_STRING_REPR+9> "", errorMessageSize=0) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:90
#3 0x000055c8201a0e5c in NPrivate::Panic (file=..., line=line@entry=30, function=0x55c80ea9505a "TSpecialKeys", expr=0x55c80e4f20e9 "Data", format=0x55c80e4effcb " ")
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:55
#4 0x000055c828b912a9 in NKikimr::NArrow::TSpecialKeys::TSpecialKeys (this=this@entry=0x7f4ebd611460, data=..., schema=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/formats/arrow/special_keys.h:30
#5 0x000055c828b8fe7a in NKikimr::NArrow::TFirstLastSpecialKeys::TFirstLastSpecialKeys (this=0x7f4ebd611460, data=..., schema=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/formats/arrow/special_keys.h:65
#6 NKikimr::NOlap::TPortionMetaConstructor::LoadMetadata (this=0x7f4ebd6116a8, portionMeta=..., indexInfo=..., groupSelector=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/portions/constructor_meta.cpp:102
#7 0x000055c82a96e4d3 in NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0::operator()(NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) const (this=0x7f4ebd611858, portion=..., metaProto=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/granule/stages.cpp:14
#8 std::__y1::__invoke[abi:ne190000]<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0&, NKikimr::NOlap::TPortionInfoConstructor, NKikimrTxColumnShard::TIndexPortionMeta const&> (__f=..., __args=..., __args=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__type_traits/invoke.h:150
#9 std::__y1::__invoke_void_return_wrapper<void, true>::__call[abi:ne190000]<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0&, NKikimr::NOlap::TPortionInfoConstructor, NKikimrTxColumnShard::TIndexPortionMeta const&>(NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0&, NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) (__args=..., __args=..., __args=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__type_traits/invoke.h:225
#10 std::__y1::__function::__alloc_func<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0, std::__y1::allocator<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0>, void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)>::operator()[abi:ne190000](NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) (
this=0x7f4ebd611858, __arg=..., __arg=...) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:169
#11 std::__y1::__function::__func<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0, std::__y1::allocator<NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&)::$_0>, void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)>::operator()(NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) (this=0x7f4ebd611850, __arg=...,
__arg=...) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:311
#12 0x000055c82ab6ffd4 in std::__y1::__function::__value_func<void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)>::operator()[abi:ne190000](NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) const (this=0x7f4ebd611850, __args=..., __args=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:428
#13 std::__y1::function<void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)>::operator()(NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&) const (this=0x7f4ebd611850, __arg=..., __arg=...) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:987
#14 NKikimr::NOlap::TDbWrapper::LoadPortions(std::__y1::optional, std::__y1::function<void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)> const&)::$_0::operator()<NKikimr::NIceDb::Schema::Table<272u>::Operations::Rowset<NKikimr::NColumnShard::Schema::IndexPortions, NKikimr::NIceDb::Schema::Table<272u>::Operations::EqualPartialKeyIterator<NKikimr::NTable::TTableIter, NKikimr::NColumnShard::Schema::IndexPortions, std::__y1::tuple >, NKikimr::NIceDb::Schema::Table<272u>::TableColumns<NKikimr::NColumnShard::Schema::IndexPortions::PathId, NKikimr::NColumnShard::Schema::IndexPortions::PortionId, NKikimr::NColumnShard::Schema::IndexPortions::SchemaVersion, NKikimr::NColumnShard::Schema::IndexPortions::XPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::XTxId, NKikimr::NColumnShard::Schema::IndexPortions::Metadata, NKikimr::NColumnShard::Schema::IndexPortions::ShardingVersion, NKikimr::NColumnShard::Schema::IndexPortions::MinSnapshotPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::MinSnapshotTxId, NKikimr::NColumnShard::Schema::IndexPortions::CommitPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::CommitTxId, NKikimr::NColumnShard::Schema::IndexPortions::InsertWriteId> > >(NKikimr::NIceDb::Schema::Table<272u>::Operations::Rowset<NKikimr::NColumnShard::Schema::IndexPortions, NKikimr::NIceDb::Schema::Table<272u>::Operations::EqualPartialKeyIterator<NKikimr::NTable::TTableIter, NKikimr::NColumnShard::Schema::IndexPortions, std::__y1::tuple >, NKikimr::NIceDb::Schema::Table<272u>::TableColumns<NKikimr::NColumnShard::Schema::IndexPortions::PathId, NKikimr::NColumnShard::Schema::IndexPortions::PortionId, NKikimr::NColumnShard::Schema::IndexPortions::SchemaVersion, NKikimr::NColumnShard::Schema::IndexPortions::XPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::XTxId, NKikimr::NColumnShard::Schema::IndexPortions::Metadata, NKikimr::NColumnShard::Schema::IndexPortions::ShardingVersion, 
NKikimr::NColumnShard::Schema::IndexPortions::MinSnapshotPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::MinSnapshotTxId, NKikimr::NColumnShard::Schema::IndexPortions::CommitPlanStep, NKikimr::NColumnShard::Schema::IndexPortions::CommitTxId, NKikimr::NColumnShard::Schema::IndexPortions::InsertWriteId> >&) const (rowset=..., this=<optimized out>)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/db_wrapper.cpp:186
#15 NKikimr::NOlap::TDbWrapper::LoadPortions(std::__y1::optional, std::__y1::function<void (NKikimr::NOlap::TPortionInfoConstructor&&, NKikimrTxColumnShard::TIndexPortionMeta const&)> const&) (this=<optimized out>, pathId=..., callback=...) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/db_wrapper.cpp:196
#16 0x000055c82a96b396 in NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute (this=0x54826efea318, txc=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/granule/stages.cpp:12
#17 0x000055c82ab46a72 in NKikimr::ITxReader::Execute (this=0x54826efea318, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/abstract.cpp:28
#18 0x000055c82a8c8620 in NKikimr::TTxCompositeReader::DoExecute (this=<optimized out>, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/composite.h:17
#19 0x000055c82ab46a72 in NKikimr::ITxReader::Execute (this=0x54826efea4f8, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/abstract.cpp:28
#20 0x000055c82a8c8620 in NKikimr::TTxCompositeReader::DoExecute (this=<optimized out>, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/composite.h:17
#21 0x000055c82ab46a72 in NKikimr::ITxReader::Execute (this=0x54826efea228, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/abstract.cpp:28
#22 0x000055c82a8c8620 in NKikimr::TTxCompositeReader::DoExecute (this=<optimized out>, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/composite.h:17
#23 0x000055c82ab46a72 in NKikimr::ITxReader::Execute (this=0x54826efe4828, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/abstract.cpp:28
#24 0x000055c82a8c8620 in NKikimr::TTxCompositeReader::DoExecute (this=<optimized out>, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/composite.h:17
#25 0x000055c82ab46a72 in NKikimr::ITxReader::Execute (this=0x54826efdd988, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tx_reader/abstract.cpp:28
#26 0x000055c82aca3ed4 in NKikimr::NColumnShard::TTxInit::Execute (this=0x548267246780, txc=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard__init.cpp:86
#27 0x000055c82348bf62 in NKikimr::NTabletFlatExecutor::TExecutor::ExecuteTransaction (this=0x548268c6b980, seat=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:1765
#28 0x000055c8234982ca in NKikimr::NTabletFlatExecutor::TExecutor::Handle (this=this@entry=0x548268c6b980, ev=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:2696
#29 0x000055c823479b42 in NKikimr::NTabletFlatExecutor::TExecutor::StateWork (this=0x548268c6b980, ev=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:3970
#30 0x000055c8208656d7 in NActors::TExecutorThread::Execute (this=this@entry=0x54827f664000, mailbox=mailbox@entry=0x548267bedd80, isTailExecution=<optimized out>)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:269
#31 0x000055c8208696b6 in NActors::TExecutorThread::ProcessExecutorPool()::$_0::operator()(NActors::TMailbox*, bool) const (this=this@entry=0x7f4ebd6125b0, mailbox=mailbox@entry=0x548267bedd80,
isTailExecution=false) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:460
#32 0x000055c820869291 in NActors::TExecutorThread::ProcessExecutorPool (this=this@entry=0x54827f664000)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:512
#33 0x000055c820869e90 in NActors::TExecutorThread::ThreadProc (this=0x54827f664000) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:538
#34 0x000055c8201ace08 in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x54827d39f650) at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/thread.cpp:244
#35 0x00007f4ecf265609 in start_thread (arg=<optimized out>) at pthread_create.c:477

@aavdonkin
Contributor

Validate fails here: https://a.yandex-team.ru/arcadia/contrib/libs/apache/arrow/cpp/src/arrow/record_batch.cc?rev=r16401619#L161
The schema has 5 fields, but there are 6 columns.
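
The failing check is a consistency check between the schema and the batch payload. A rough sketch of the idea using plain standard-library types, with a hypothetical `Batch` struct and `ValidateBatch` function (this is an illustration, not Arrow's implementation):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a record batch: field names plus column data.
struct Batch {
    std::vector<std::string> SchemaFields;
    std::vector<std::vector<long long>> Columns;
};

// Returns an empty string when the batch is consistent, otherwise an error text.
std::string ValidateBatch(const Batch& batch) {
    if (batch.SchemaFields.size() != batch.Columns.size()) {
        return "schema has " + std::to_string(batch.SchemaFields.size()) +
               " fields, but batch has " + std::to_string(batch.Columns.size()) +
               " columns";
    }
    return {};
}
```

A batch with 5 schema fields and 6 columns, as in the comment above, produces a non-empty error from this sketch.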

@aavdonkin
Contributor

Tenant information:
(gdb) p Self->TabletInfo.T_
$5 = (NKikimr::TTabletStorageInfo ) 0x458e33143250
(gdb) p Self->TabletInfo.T_[0]
$6 = { = {<TRefCounted<TThrRefBase, TAtomicCounter, TDelete>> = {Counter_ = {
Counter_ = {<std::__y1::__atomic_base<long, true>> = {<std::__y1::__atomic_base<long, false>> = {
_a = {<std::__y1::__cxx_atomic_base_impl> = {__a_value = 5}, },
static is_always_lock_free = }, }, }}},
_vptr$TThrRefBase = 0x562c29c4d888 <vtable for NKikimr::TTabletStorageInfo+16>}, TabletID = 72075186251006816,
Channels = {<std::__y1::vector<NKikimr::TTabletChannelInfo, std::__y1::allocatorNKikimr::TTabletChannelInfo >> = {_begin = 0x458e2dc40c00,
_end = 0x458e2dc40c78, _end_cap = {<std::__y1::__compressed_pair_elem<NKikimr::TTabletChannelInfo
, 0, false>> = {
_value = 0x458e2dc40c78}, <std::__y1::__compressed_pair_elem<std::__y1::allocatorNKikimr::TTabletChannelInfo, 1, true>> = {<std::__y1::allocatorNKikimr::TTabletChannelInfo> = {<std::__y1::__non_trivial_if<true, std::__y1::allocatorNKikimr::TTabletChannelInfo >> = {}, }, }, }}, }, TabletType = NKikimrTabletBase::TTabletTypes_EType_ColumnShard,
Version = 1, TenantPathId = {OwnerId = 72057594046678944, LocalPathId = 41268}, HiveId = 72075186244834444}

@aavdonkin
Contributor

Table information:
(gdb) fr 16
#16 0x0000562c1e963396 in NKikimr::NOlap::NLoading::TGranuleOnlyPortionsReader::DoExecute (this=0x458e38c67528, txc=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/granule/stages.cpp:12
12 /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/storage/granule/stages.cpp: No such file or directory.
(gdb) p Self
$11 = (NKikimr::NOlap::TGranuleMeta *) 0x458e33057218

(gdb) p Self->PathId
$13 = 11

@swalrus1
Collaborator

Error from the logs:

Apr 22 13:41:38 vm-cc8mco0j0snqehgh7r2a-ru-central1-b-onkf-edis KIKIMR[656323]: 2025-04-22T13:41:38.213839Z :ARROW_HELPER ERROR: tablet_id=72075186251006849;event=initialize_shard;load_stage_name=EXECUTE:composite_init;load_stage_name=EXECUTE:column_engines;load_stage_name=EXECUTE:column_engines/granules;load_stage_name=EXECUTE:granules/granule;load_stage_name=EXECUTE:granule/portions;fline=arrow_helpers.cpp:142;event=cannot_parse;message=Serialization error: batch is not valid: Invalid: Buffer #1 too small in array of type int64 and length 2: expected at least 16 byte(s), got 0;schema_columns_count=5;schema_columns=EventTime,CounterID,EventDate,UserID,WatchID;
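
The numbers in the Arrow message follow from simple arithmetic: an array of 2 int64 values needs at least 2 × 8 = 16 bytes in its values buffer, while the deserialized buffer had 0 bytes. A minimal sketch of that size check, with hypothetical helper names (null bitmaps and offsets are ignored):

```cpp
#include <cstddef>
#include <cstdint>

// Minimum size in bytes of the values buffer for `length` fixed-width elements.
constexpr std::size_t MinValuesBufferBytes(std::size_t length, std::size_t elementBytes) {
    return length * elementBytes;
}

// True when a buffer of `actualBytes` can hold `length` int64 values.
constexpr bool Int64BufferFits(std::size_t length, std::size_t actualBytes) {
    return actualBytes >= MinValuesBufferBytes(length, sizeof(std::int64_t));
}
```

With the values from the log, `Int64BufferFits(2, 0)` is false, which matches the "expected at least 16 byte(s), got 0" part of the message.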

@swalrus1 swalrus1 marked this as a duplicate of #17564 Apr 23, 2025
@aavdonkin
Contributor

I found that the crashes are caused by an incorrect schema, namely utf8 types instead of integer types in TIndexInfo::PrimaryKey.
Stack trace of where the wrong type gets set:
(gdb) bt
#0 NKikimr::NOlap::TIndexInfo::SetAllKeys (this=this@entry=0x7f42fd8082a0, operators=..., columns=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/scheme/index_info.cpp:119
#1 0x0000557e1343519a in NKikimr::NOlap::TIndexInfo::DeserializeFromProto (this=this@entry=0x7f42fd8082a0, schema=..., operators=..., cache=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/scheme/index_info.cpp:284
#2 0x0000557e1343860d in NKikimr::NOlap::TIndexInfo::BuildFromProto (schema=..., operators=..., cache=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/scheme/index_info.cpp:331
#3 0x0000557e1543cae3 in NKikimr::NOlap::TColumnEngineForLogs::RegisterSchemaVersion (this=0x4d7377be400, snapshot=..., presetId=0, schema=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/column_engine_logs.cpp:102
#4 0x0000557e1543afc7 in NKikimr::NOlap::TColumnEngineForLogs::TColumnEngineForLogs (this=0x4d7377be400, tabletId=0, schemaCache=..., dataAccessorsManager=...,
storagesManager=..., snapshot=..., presetId=0, schema=..., counters=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/column_engine_logs.cpp:41
#5 0x0000557e15653257 in std::__y1::make_unique[abi:ne190000]<NKikimr::NOlap::TColumnEngineForLogs, unsigned long&, std::__y1::shared_ptrNKikimr::NOlap::TSchemaObjectsCache&, std::__y1::shared_ptrNKikimr::NOlap::NDataAccessorControl::IDataAccessorsManager&, std::__y1::shared_ptrNKikimr::NOlap::IStoragesManager&, NKikimr::NOlap::TSnapshot&, unsigned int&, NKikimr::NOlap::IColumnEngine::TSchemaInitializationData&, std::__y1::shared_ptrNKikimr::NColumnShard::TPortionIndexStats&>(unsigned long&, std::__y1::shared_ptrNKikimr::NOlap::TSchemaObjectsCache&, std::__y1::shared_ptrNKikimr::NOlap::NDataAccessorControl::IDataAccessorsManager&, std::__y1::shared_ptrNKikimr::NOlap::IStoragesManager&, NKikimr::NOlap::TSnapshot&, unsigned int&, NKikimr::NOlap::IColumnEngine::TSchemaInitializationData&, std::__y1::shared_ptrNKikimr::NColumnShard::TPortionIndexStats&) (__args=...,
__args=..., __args=..., __args=..., __args=..., __args=..., __args=..., __args=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/unique_ptr.h:621
#6 NKikimr::NColumnShard::TTablesManager::InitFromDB (this=0x7f42fd808b40, db=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tables_manager.cpp:166
#7 0x0000557e0a398ab0 in NKikimr::NOlap::NRestorePortionsFromChunks::TNormalizer::DoInit (this=0x4d72ca72730, controller=..., txc=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/normalizer/portion/restore_portion_from_chunks.cpp:77
#8 0x0000557e15487c6c in NKikimr::NOlap::TNormalizationController::INormalizerComponent::Init (this=0x4d72ca72730, controller=..., txc=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/normalizer/abstract/abstract.cpp:156
#9 0x0000557e15598985 in NKikimr::NColumnShard::TTxUpdateSchema::Execute (this=0x4d72b6f7340, txc=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard__init.cpp:136
#10 0x0000557e0dd7ff62 in NKikimr::NTabletFlatExecutor::TExecutor::ExecuteTransaction (this=0x4d731e11280, seat=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:1765
#11 0x0000557e0dd8c2ca in NKikimr::NTabletFlatExecutor::TExecutor::Handle (this=this@entry=0x4d731e11280, ev=..., ctx=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:2696
#12 0x0000557e0dd6db42 in NKikimr::NTabletFlatExecutor::TExecutor::StateWork (this=0x4d731e11280, ev=...)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:3970
#13 0x0000557e0b1596d7 in NActors::TExecutorThread::Execute (this=this@entry=0x4d73f453800, mailbox=mailbox@entry=0x4d73281e1c0, isTailExecution=<optimized out>)
at /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:269

@zverevgeny zverevgeny assigned swalrus1 and unassigned aavdonkin Apr 28, 2025
@swalrus1 swalrus1 linked a pull request Apr 28, 2025 that will close this issue
@zverevgeny
Collaborator

There was a clash in the schema cache on shared (multi-database) installations: the cache key did not include the database, so tables with the same pathId in different databases ended up under the same cache key.
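
A minimal sketch of the collision described above, assuming a hypothetical get-or-insert cache that maps a key to the primary-key column type. With pathId alone as the key, a second database silently receives the entry cached for the first one; adding a database identifier to the key keeps them apart. All names here are illustrative; this is neither the actual schema cache code nor the fix from the linked pull request:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Key that ignores the database: tables from different databases can collide.
using TPathOnlyKey = std::uint64_t;
// Key that includes a (hypothetical) database id alongside pathId.
using TDbPathKey = std::pair<std::uint64_t, std::uint64_t>;

// Get-or-insert: returns the cached value if the key exists, else stores `fresh`.
template <class TKey>
const std::string& GetOrInsert(std::map<TKey, std::string>& cache,
                               const TKey& key, const std::string& fresh) {
    return cache.emplace(key, fresh).first->second;
}
```

For example, after database 1 caches "utf8" for pathId 11, a lookup for pathId 11 from database 2 returns "utf8" with `TPathOnlyKey` even though that table expects "int64"; with `TDbPathKey` the keys {1, 11} and {2, 11} are distinct and each database gets its own entry.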
