Skip to content

Commit 233dfea

Browse files
Nelson Goncalvesakojima
Nelson Goncalves
authored andcommitted
WL#12773 BUG#29629121 BUG#29867008 BUG#29559303 BUG#29868204 BUG#29869023 BUG#29862474 Simplification of internal recovery accounts
This simplifies the number of recovery accounts that are created. After this patch a only a single recovery account is created for each member of the cluster, even if the ipWhitelist option is used. Furthermore, the recovery user with the "localhost" host part is no longer created (only '%'). This can lead to an issue in local experimental setups using "localhost" for the report_host. It is a known limitation, but not considered a problem because "localhost" is not expected to be used (especially not in a real production settings). This patch also deprecates the superReadOnly option for the createCluster and rebootClusterFromCompleteOutage commands. The super-read-only mode is now automatically disabled. Existing tests were modified and updated according to the new number and name format of the recovery accounts. No new tests were added because existing tests already covered the test plan proposed by QA. BUG#29629121: track created users, do not use pattern to guess when cleaning up This bug was fixed by using the information about the recovery users stored in the metadata or in the mysql.slave_master_info table when dropping the recovery user. Users are no longer dropped according to a regular expression. BUG#29867008: removeinstance(), rejoininstance() shouldn't drop all accounts This bug was only partially fixed in the context of the WL. This wl fixed it by dropping the recovery account that belongs to the instance removed instance at the primary instance and then waiting for that information to be replicated to the instance being removed before actually removing the instance from the cluster. BUG#29559303: cluster.dissolve() fails This bug was fixed by handling both recovery account formats used by the shell: the new name format defined by this worklog 'mysql_innodb_cluster_<server_id>' and the old account format 'mysql_innodb_cluster_r<random_number>' when dropping a recovery user of an instance. This is done in both cluster.removeInstance and cluster.dissolve. BUG#29868204: cluster.addinstance() fails complaining about read-only mode This bug was a symptom of a new check that was added during the context of this worklog but has since been removed. The check needed to create a new user and that check was being done before the SRO was disabled by the addInstance operation. Since the check is no longer made, nothing really needed to be done to fix this issue. BUG#29869023: cluster.rejoininstance() fails: instances cannot communicate with each other This bug was a symptom of a new check that was added during the context of this worklog but has since been removed. The check executed a "change master" that was not correctly handling cases where the servers were using ssl. This issue has since been fixed even tough the check is not currently used. BUG#29862474: dba.checkinstanceconfiguration fails when --super-read-only is on This bug was a symptom of a new check that was added to dba.checkInstanceConfiguration() during the context of this worklog but has since been removed. The new check would not work if the target instance had SRO enabled. This was a new requrement, as previously none of the checks made by dba.checkInstanceConfiguration() would require SRO to be disabled and that was the cause of the issue. Since the check has since been removed, there was nothing that needed to be done to fix this issue. Change-Id: Ib567e3f2c7a3bd23c515b04d91354fb1e6352cf9
1 parent 18a2f6f commit 233dfea

File tree

90 files changed

+1295
-1776
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

90 files changed

+1295
-1776
lines changed

modules/adminapi/cluster/cluster_impl.cc

+96
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,7 @@
4545
#include "mysqlshdk/include/shellcore/console.h"
4646
#include "mysqlshdk/libs/mysql/group_replication.h"
4747
#include "mysqlshdk/libs/mysql/replication.h"
48+
#include "mysqlshdk/libs/mysql/utils.h"
4849
#include "shellcore/utils_help.h"
4950
#include "utils/debug.h"
5051
#include "utils/utils_general.h"
@@ -378,5 +379,100 @@ mysqlshdk::utils::Version Cluster_impl::get_lowest_instance_version() const {
378379
return lowest_version;
379380
}
380381

382+
mysqlshdk::mysql::Auth_options Cluster_impl::create_replication_user(
383+
mysqlshdk::mysql::IInstance *target) {
384+
assert(target);
385+
mysqlshdk::mysql::Instance primary_master(get_group_session());
386+
std::string recovery_user =
387+
mysqlshdk::gr::k_group_recovery_user_prefix +
388+
std::to_string(target->get_sysvar_int("server_id").get_safe());
389+
390+
log_info("Creating recovery account '%s'@'%%' for instance '%s'",
391+
recovery_user.c_str(), target->descr().c_str());
392+
393+
// Check if the replication user already exists and delete it if it does,
394+
// before creating it again.
395+
bool rpl_user_exists = primary_master.user_exists(recovery_user, "%");
396+
if (rpl_user_exists) {
397+
auto console = current_console();
398+
console->print_warning(shcore::str_format(
399+
"User '%s'@'%%' already existed at instance '%s'. It will be "
400+
"deleted and created again with new credentials.",
401+
recovery_user.c_str(), primary_master.descr().c_str()));
402+
primary_master.drop_user(recovery_user, "%");
403+
}
404+
return mysqlshdk::gr::create_recovery_user(recovery_user, &primary_master,
405+
{"%"}, {});
406+
}
407+
408+
void Cluster_impl::drop_replication_user(mysqlshdk::mysql::IInstance *target) {
409+
assert(target);
410+
mysqlshdk::mysql::Instance primary_master(get_group_session());
411+
auto console = current_console();
412+
413+
std::string recovery_user, recovery_user_host;
414+
std::tie(recovery_user, recovery_user_host) =
415+
get_metadata_storage()->get_instance_recovery_account(target->get_uuid());
416+
if (recovery_user.empty()) {
417+
// TODO(Miguel): WL13208: handle cases where the replication user was stolen
418+
// from donor during clone
419+
420+
// Assuming the account was created by an older version of the shell,
421+
// which did not store account name in metadata
422+
log_info(
423+
"No recovery account details in metadata for instance '%s', assuming "
424+
"old account style",
425+
target->descr().c_str());
426+
427+
// Get replication user (recovery) used by the instance to remove
428+
// on remaining members.
429+
recovery_user = mysqlshdk::gr::get_recovery_user(*target);
430+
431+
// Remove the replication user used by the removed instance on all
432+
// cluster members through the primary (using replication).
433+
// NOTE: Make sure to remove the user if it was an user created by
434+
// the Shell, i.e. with the format: mysql_innodb_cluster_r[0-9]{10}
435+
if (shcore::str_beginswith(
436+
recovery_user, mysqlshdk::gr::k_group_recovery_old_user_prefix)) {
437+
log_info("Removing replication user '%s'", recovery_user.c_str());
438+
try {
439+
mysqlshdk::mysql::drop_all_accounts_for_user(get_group_session(),
440+
recovery_user);
441+
} catch (const std::exception &e) {
442+
console->print_warning("Error dropping recovery accounts for user " +
443+
recovery_user + ": " + e.what());
444+
}
445+
} else {
446+
console->print_note("The recovery user name for instance '" +
447+
target->descr() +
448+
"' does not match the expected format for users "
449+
"created automatically "
450+
"by InnoDB Cluster. Skipping its removal.");
451+
}
452+
} else {
453+
log_info("Dropping recovery account '%s'@'%s' for instance '%s'",
454+
recovery_user.c_str(), recovery_user_host.c_str(),
455+
target->descr().c_str());
456+
457+
try {
458+
primary_master.drop_user(recovery_user, "%", true);
459+
} catch (const std::exception &e) {
460+
console->print_warning(shcore::str_format(
461+
"Error dropping recovery account '%s'@'%s' for instance '%s'",
462+
recovery_user.c_str(), recovery_user_host.c_str(),
463+
target->descr().c_str()));
464+
}
465+
466+
// Also update metadata
467+
try {
468+
get_metadata_storage()->update_instance_recovery_account(
469+
target->get_uuid(), "", "");
470+
} catch (const std::exception &e) {
471+
log_warning("Could not update recovery account metadata for '%s': %s",
472+
target->descr().c_str(), e.what());
473+
}
474+
}
475+
}
476+
381477
} // namespace dba
382478
} // namespace mysqlsh

modules/adminapi/cluster/cluster_impl.h

+5-3
Original file line numberDiff line numberDiff line change
@@ -37,9 +37,6 @@
3737
#include "mysqlshdk/libs/db/connection_options.h"
3838
#include "mysqlshdk/libs/innodbcluster/cluster_metadata.h"
3939

40-
#define ACC_USER "username"
41-
#define ACC_PASSWORD "password"
42-
4340
#define ATT_DEFAULT "default"
4441

4542
namespace mysqlsh {
@@ -98,6 +95,11 @@ class Cluster_impl {
9895
return _group_session;
9996
}
10097

98+
mysqlshdk::mysql::Auth_options create_replication_user(
99+
mysqlshdk::mysql::IInstance *target);
100+
101+
void drop_replication_user(mysqlshdk::mysql::IInstance *target);
102+
101103
void disconnect();
102104

103105
public:

modules/adminapi/cluster/dissolve.cc

+13-9
Original file line numberDiff line numberDiff line change
@@ -329,9 +329,6 @@ void Dissolve::remove_instance(const std::string &instance_address,
329329
// Stop Group Replication and reset (persist) GR variables.
330330
mysqlsh::dba::leave_replicaset(*m_available_instances[instance_index]);
331331

332-
// Remove existing replications users.
333-
m_cluster->get_default_replicaset()->remove_replication_users(
334-
*m_available_instances[instance_index], false);
335332
} catch (const std::exception &err) {
336333
auto console = mysqlsh::current_console();
337334

@@ -345,7 +342,7 @@ void Dissolve::remove_instance(const std::string &instance_address,
345342
log_error("Instance '%s' failed to leave the ReplicaSet: %s",
346343
instance_address.c_str(), err.what());
347344
console->print_warning(
348-
"An error occured when trying to remove instance '" +
345+
"An error occurred when trying to remove instance '" +
349346
instance_address +
350347
"' from the cluster. The instance might have been left active "
351348
"and in an inconsistent state, requiring manual action to "
@@ -356,13 +353,20 @@ void Dissolve::remove_instance(const std::string &instance_address,
356353
}
357354

358355
shcore::Value Dissolve::execute() {
359-
// Drop the cluster's metadata.
360356
std::shared_ptr<MetadataStorage> metadata = m_cluster->get_metadata_storage();
357+
auto console = mysqlsh::current_console();
358+
359+
// JOB: Remove replication accounts used for recovery of GR.
360+
// Note: This operation MUST be performed before leave-replicaset to ensure
361+
// that all changed are propagated to the online instances.
362+
for (auto &instance : m_available_instances) {
363+
m_cluster->drop_replication_user(instance.get());
364+
}
365+
366+
// Drop the cluster's metadata.
361367
std::string cluster_name = m_cluster->get_name();
362368
metadata->drop_cluster(cluster_name);
363369

364-
auto console = mysqlsh::current_console();
365-
366370
// We must stop GR on the online instances only, otherwise we'll
367371
// get connection failures to the (MISSING) instances
368372
// BUG#26001653.
@@ -406,7 +410,7 @@ shcore::Value Dissolve::execute() {
406410
"%s",
407411
instance_address.c_str(), err.what());
408412
console->print_warning(
409-
"An error occured when trying to catch up with cluster "
413+
"An error occurred when trying to catch up with cluster "
410414
"transactions and instance '" +
411415
instance_address +
412416
"' might have been left in an inconsistent state that will lead "
@@ -480,7 +484,7 @@ shcore::Value Dissolve::execute() {
480484
warning_msg.append("in order to be able to be reused.");
481485
console->print_warning(warning_msg);
482486
} else {
483-
// Full cluster sucessfuly removed.
487+
// Full cluster successfully removed.
484488
console->println("The cluster was successfully dissolved.");
485489
console->println("Replication was disabled but user data was left intact.");
486490
console->println();

modules/adminapi/common/common.cc

-43
Original file line numberDiff line numberDiff line change
@@ -103,26 +103,6 @@ std::string get_mysqlprovision_error_string(
103103
return shcore::str_join(str_errors, "\n");
104104
}
105105

106-
std::vector<std::string> convert_ipwhitelist_to_netmask(
107-
const std::vector<std::string> &ip_whitelist) {
108-
std::vector<std::string> ret;
109-
110-
for (const std::string &value : ip_whitelist) {
111-
// Strip any blank chars from the ip_whitelist value and
112-
// Translate CIDR to netmask notation
113-
ret.push_back(
114-
mysqlshdk::utils::Net::cidr_to_netmask(shcore::str_strip(value)));
115-
}
116-
117-
return ret;
118-
}
119-
120-
std::vector<std::string> convert_ipwhitelist_to_netmask(
121-
const std::string &ip_whitelist) {
122-
return convert_ipwhitelist_to_netmask(
123-
shcore::str_split(shcore::str_strip(ip_whitelist), ",", -1));
124-
}
125-
126106
bool is_group_replication_option_supported(
127107
const mysqlshdk::utils::Version &version, const std::string &option,
128108
const std::map<std::string, Option_availability> &options_map) {
@@ -1017,29 +997,10 @@ bool validate_super_read_only(const mysqlshdk::mysql::IInstance &instance,
1017997
"For more information see: https://dev.mysql.com/doc/refman/en/"
1018998
"server-system-variables.html#sysvar_super_read_only.");
1019999

1020-
auto show_open_sessions = [&](const std::string &message) {
1021-
// Get the list of open session to the instance
1022-
std::vector<std::pair<std::string, int>> open_sessions =
1023-
get_open_sessions(instance.get_session());
1024-
if (!open_sessions.empty()) {
1025-
console->print_note(message);
1026-
for (const auto &value : open_sessions) {
1027-
console->print_info(std::to_string(value.second) +
1028-
" open session(s) of '" + value.first + "'. \n");
1029-
}
1030-
}
1031-
};
1032-
10331000
if (clear_read_only.is_null() && interactive) {
10341001
console->print_info(error_message);
10351002
console->print_info();
10361003

1037-
show_open_sessions(
1038-
"There are open sessions to '" + instance.descr() +
1039-
"'.\n"
1040-
"You may want to kill these sessions to prevent them from "
1041-
"performing unexpected updates: \n");
1042-
10431004
if (console->confirm(
10441005
"Do you want to disable super_read_only and continue?",
10451006
mysqlsh::Prompt_answer::NO) == mysqlsh::Prompt_answer::NO) {
@@ -1060,10 +1021,6 @@ bool validate_super_read_only(const mysqlshdk::mysql::IInstance &instance,
10601021
} else {
10611022
console->print_error(error_message);
10621023

1063-
show_open_sessions(
1064-
"If you unset super_read_only you should consider closing the "
1065-
"following: ");
1066-
10671024
throw shcore::Exception::runtime_error("Server in SUPER_READ_ONLY mode");
10681025
}
10691026
}

modules/adminapi/common/common.h

-15
Original file line numberDiff line numberDiff line change
@@ -182,21 +182,6 @@ extern const char *kMemberSSLModeAuto;
182182
extern const char *kMemberSSLModeRequired;
183183
extern const char *kMemberSSLModeDisabled;
184184

185-
/**
186-
* Converts an ipWhitelist to a list of addresses using the netmask notation
187-
*
188-
* @param ip_whitelist The vector of strings containing the ipWhitelist to
189-
* convert
190-
*
191-
* @return A vector of strings containing the ipWhitelist converted to netmask
192-
* notation
193-
*/
194-
std::vector<std::string> convert_ipwhitelist_to_netmask(
195-
const std::vector<std::string> &ip_whitelist);
196-
197-
std::vector<std::string> convert_ipwhitelist_to_netmask(
198-
const std::string &ip_whitelist);
199-
200185
/**
201186
* Check if a group_replication setting is supported on the target
202187
* instance

0 commit comments

Comments
 (0)