solana

Commit Graph

Author	SHA1	Message	Date
Michael Vines	1e0942e900	Rename ClusterInfo::send_vote to ClusterInfo::send_transaction	2021-07-07 15:51:14 -07:00
Justin Starry	92c5cdab62	Fix cargo check (#18499 )	2021-07-07 14:21:08 -05:00
behzad nouri	dba42c57b4	implements an unbiased weighted shuffle using binary indexed tree (#18343 ) Current implementation of weighted_shuffle: https://github.com/solana-labs/solana/blob/b08f8bd1b/gossip/src/weighted_shuffle.rs#L11-L37 uses a heuristic which results in biased samples. For example, if the weights are [1, 10, 100], then the 3rd index should come first 100 times more often than the 1st index. However, weighted_shuffle is picking the 3rd index 200+ times more often than the 1st index, showing a disproportionate bias in favor of higher weights. This commit implements weighted shuffle using binary indexed tree to maintain cumulative sum of weights while sampling. The resulting samples are demonstrably unbiased and precisely proportional to the weights. Additionally, the iterator interface allows skipping computations when not all indices are processed. Of the use cases of weighted_shuffle, changing turbine code requires feature-gating to keep the cluster in sync. That is not updated in this commit, but can be done together with future updates to turbine.	2021-07-07 14:14:43 +00:00
behzad nouri	04787be8b1	encapsulates turbine peers computations of broadcast & retransmit stages (#18238 ) Broadcast stage and retransmit stage should arrange nodes on turbine broadcast tree in exactly the same order. Additionally, any changes to this ordering (e.g. updating how unstaked nodes are handled) require feature gating to keep the cluster in sync. The current implementation is scattered across several public methods and exposes too many implementation details (e.g. usize indices into peers vector), which makes code changes and checking for feature activations more difficult. This commit encapsulates turbine peer computations into a new struct, and only exposes two public methods, get_broadcast_peer and get_retransmit_peers, for call-sites.	2021-07-07 00:35:25 +00:00
Michael Vines	c17451ca73	Acquire instance read lock once	2021-07-01 17:50:04 -07:00
Michael Vines	db3a9ae7fb	Fully replace NodeInstance	2021-07-01 17:50:04 -07:00
Michael Vines	71efac46cb	Hoist keypair() out of some loops	2021-07-01 17:50:04 -07:00
Michael Vines	b6792a3328	Add ability to change the validator identity at runtime	2021-07-01 17:50:04 -07:00
Michael Vines	bf157506e8	Remove id ref	2021-07-01 17:50:04 -07:00
Ashwin Sekar	f4fb5de545	Consider all peers as potential candidates during pull-request in case of offline nodes (#18333 ) * Try all peers during pull-request in case of offline nodes * fix clippy err	2021-07-01 12:00:10 -07:00
behzad nouri	9d983a34a0	debug logs when crds table trim failed (#18307 ) reports of this error being possibly spammy: https://discord.com/channels/428295358100013066/689412830075551748/859441080054710293 The commit changes the log level to debug. Additionally adding a new metric to understand the frequency of this error.	2021-06-29 19:39:46 +00:00
behzad nouri	d7b8329b45	removes repeated calls to ClusterInfo::id in iterators and contact-info clone (#18174 ) Calling ClusterInfo::id repeatedly in for loops or iterators is inefficient, because it acquires a lock on ClusterInfo.my_contact_info, and clones the entire contact-info.	2021-06-23 16:30:14 +00:00
behzad nouri	69a5f0e6cd	filters crds values obtained through gossip by their shred version (#18072 ) filter_by_shred_version does not check the shred-version of the owner of the crds-value. It only checks the shred-version of the node which is relaying the value: https://github.com/solana-labs/solana/blob/5cc073420/gossip/src/cluster_info.rs#L2274-L2289 So crds-values with different shred versions can still pass through this function as long as they are relayed by a node with matching shred version; and so, a single node can bridge different shred values throughout the cluster.	2021-06-23 14:16:05 +00:00
Michael Vines	84b9de8c18	Shredder no longer holds a keypair	2021-06-21 21:29:52 -07:00
Michael Vines	553fc210f5	Remove duplicated id field	2021-06-21 21:29:52 -07:00
behzad nouri	598093b5db	adds shred-version to ip-echo-server response When starting a validator, the node initially joins gossip with shred_version = 0, until it adopts the entrypoint's shred-version: https://github.com/solana-labs/solana/blob/9b182f408/validator/src/main.rs#L417 Depending on the load on the entrypoint, adopting the entrypoint's shred-version through gossip sometimes becomes very slow, and causes several problems in gossip because we have to partially support shred_version == 0 which is a source of leaking crds values from one cluster to another. e.g. see https://github.com/solana-labs/solana/pull/17899 and the other linked issues there. In order to remove shred_version == 0 from gossip, this commit adds shred-version to ip-echo-server response. Once the entrypoints are updated, on validator start-up, if --expected_shred_version is not specified we will obtain shred-version from the entrypoint using ip-echo-server.	2021-06-21 19:37:16 +00:00
Alexander Meißner	789f33e8db	chore: cargo fmt	2021-06-18 10:42:46 -07:00
Alexander Meißner	6514096a67	chore: cargo +nightly clippy --fix -Z unstable-options	2021-06-18 10:42:46 -07:00
behzad nouri	5a99fa3790	adds mapping from nodes pubkeys to their shred-version (#17940 ) Crds values of nodes with different shred versions are creeping into gossip table resulting in runtime issues as the one addressed in: https://github.com/solana-labs/solana/pull/17899 This commit works towards enforcing more checks and filtering based on shred version by adding necessary mapping and api to gossip table. Once populated, pubkey->shred-version mapping persists as long as there are any values associated with the pubkey.	2021-06-18 15:56:04 +00:00
sakridge	eeee75c5be	Don't use pinned memory when unnecessary (#17832 ) Reports of excessive GPU memory usage and errors from cudaHostRegister. There are some cases where pinning is not required.	2021-06-14 16:10:04 +02:00
behzad nouri	cca46308bc	short cuts expiration check if origin's contact-info is still valid (#17918 ) Crds::find_old_labels can skip checking values timestamps if the origin's contact info hasn't expired yet: https://github.com/solana-labs/solana/blob/985280ec0/gossip/src/crds.rs#L394-L408	2021-06-13 19:47:07 +00:00
behzad nouri	985280ec0b	excludes epoch-slots from nodes with unknown or different shred version (#17899 ) Inspecting TDS gossip table shows that crds values of nodes with different shred-versions are creeping in. Their epoch-slots are accumulated in ClusterSlots causing bogus slots very far from current root which are not purged and so cause ClusterSlots to keep consuming more memory: https://github.com/solana-labs/solana/issues/17789 https://github.com/solana-labs/solana/issues/14366#issuecomment-769896036 https://github.com/solana-labs/solana/issues/14366#issuecomment-832754654 This commit updates ClusterInfo::get_epoch_slots, and discards entries from nodes with unknown or different shred-version. Follow up commits will patch gossip not to waste bandwidth and memory over crds values of nodes with different shred-version.	2021-06-13 14:08:08 +00:00
behzad nouri	cab30e2356	parallelizes gossip packets receiver with processing of requests (#17647 ) Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the latter, allowing better pipelining properties and smaller latency in responding to gossip requests or propagating messages.	2021-06-07 18:36:06 +00:00
behzad nouri	60b0a13444	writes epoch-slots to crds table synchronously (#17719 ) epoch-slots may be overwritten before they are written to crds table: https://github.com/solana-labs/solana/issues/17711 This commit writes new epoch-slots to crds table synchronously with push_epoch_slots. The function is still not thread-safe as commented in the code, however currently only one thread is invoking this code.	2021-06-04 13:56:51 +00:00
behzad nouri	be957f25c9	adds fallback logic if retransmit multicast fails (#17714 ) In retransmit-stage, based on the packet.meta.seed and resulting children/neighbors, each packet is sent to a different set of peers: https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L421-L457 However, current code errors out as soon as a multicast call fails, which will skip all the remaining packets: https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470 This can exacerbate packet loss in turbine. This commit: * keeps iterating over retransmit packets for loop even if some intermediate sends fail. * adds a fallback to UdpSocket::send_to if multicast fails. Recent discord chat: https://discord.com/channels/428295358100013066/689412830075551748/849530845052403733	2021-06-04 12:16:37 +00:00
Tyera Eulberg	3a647c4bea	Rename ValidatorExit and move to sdk (#17728 )	2021-06-04 03:06:13 +00:00
behzad nouri	7cf6e66ddd	excludes caller's crds values from pull responses (#17542 ) If the crds entry belongs to the caller itself, then the caller will always have the more recent version of it, regardless of it being filtered out by the bloom filter or not. The exception is node-instance types which are meant to detect duplicate running instances, and those are exempted.	2021-05-28 13:19:14 +00:00
Tyera Eulberg	9a5330b7eb	Move gossip modules into solana-gossip crate (#17352 ) * Move gossip modules to solana-gossip * Update Protocol abi digest due to move * Move gossip benches and hook up CI * Remove unneeded Result entries * Single use statements	2021-05-26 09:15:46 -06:00
behzad nouri	cf1acfb021	uses Duration type for gossip discover timeout	2021-05-22 19:17:36 +00:00
Michael Vines	a911ae00ba	clippy	2021-04-18 20:55:02 -07:00
Michael Vines	24ab84936e	Break up RPC API into three categories: minimal, full and admin	2021-03-04 16:39:44 -08:00
Michael Vines	328f59ebef	--gossip-host may now be specified with --entrypoint	2020-11-13 06:20:15 +00:00
Michael Vines	a1e2357d12	`solana-gossip spy` can now be given an identity keypair (`--identity` argument)	2020-08-22 17:00:50 -07:00
sakridge	0c72f62e96	Refactor gossip code from one huge function (#10701 )	2020-06-18 22:20:52 -07:00
sakridge	c4a096d8d4	Wait 15 seconds for gossip rpc url (#10053 )	2020-05-15 13:23:40 -07:00
Jack May	eb1acaf927	Remove archiver and storage program (#9992 ) automerge	2020-05-14 18:22:47 -07:00
Michael Vines	4e4a21f9b7	`solana-gossip spy` can now specify a shred version (#10040 )	2020-05-13 19:37:40 -07:00
Michael Vines	2521f75c18	Advertise node software version in gossip (#9981 ) * Advertise node version in gossip * Remove solana_clap_utils::version! macro	2020-05-11 15:02:01 -07:00
Michael Vines	d1cbccd9ba	solana-dos can now DoS gossip nodes (#9652 ) automerge	2020-04-23 11:46:12 -07:00
Michael Vines	448b957a13	Add --bind-address and --rpc-bind-address validator arguments (#8628 )	2020-03-04 22:46:43 -07:00
Michael Vines	356f246a74	Remove get-/show- prefix from cli commands	2020-01-21 08:43:07 -07:00
Jack May	07855e3125	Allow override of RUST_LOG (#7705 )	2020-01-08 09:19:12 -08:00
Michael Vines	48a36f59a6	Add get-rpc-url --any option	2020-01-02 17:20:59 -07:00
Michael Vines	965b132664	Permit --gossip-host with --entrypoint	2020-01-02 17:20:59 -07:00
Michael Vines	b0271394cd	Clean up --gossip-port argument (#7067 ) --gossip-port now specifies exactly that, the gossip port to use. The new --gossip-host argument can be used to specify the DNS name/IP address for gossip if --entrypoint is not supplied (when --entrypoint is supplied, the gossip address is automatically set to the node's ip address as observed by the entrypoint)	2019-11-20 15:21:34 -07:00
Michael Vines	c3926e6af0	|solana-gossip spy| no longer requires an entrypoint (#6999 )	2019-11-16 14:16:28 -07:00
Ryo Onodera	4fc767b3f6	Move version! from core:: to clap_utils:: (#6944 ) * Move version! from core to clap-utils * Completely move version! from core:: to clap_utils:: * rustfmt * Do remaining transition after rebase	2019-11-14 13:10:38 +09:00
Ryo Onodera	3faeb7fa79	Rename solana-netutil to solana-net-utils for consistency (#6895 ) * sed -i -e 's/netutil/net_utils/g' $(git grep --files-with-matches netutil :*.rs) sed -i -e 's/netutil/net-utils/g' $(git grep --files-with-matches netutil) * git mv netutil/ net-utils * Tweak a bit * Fix rustfmt & clippy	2019-11-12 13:37:13 -07:00
Ryo Onodera	d84f367317	Extract duplicate clap helpers into clap-utils (#6812 )	2019-11-12 09:42:08 +09:00
Michael Vines	4be646c695	discover() by gossip sockaddr instead of just by gossip ip address (#6865 )	2019-11-11 12:42:58 -07:00
Michael Vines	0fbd508c5f	Only check the entrypoint's RPC address (#6851 )	2019-11-09 00:56:31 -07:00
Michael Vines	397ea05aa7	spy nodes are now gossip entrypoints (#6532 )	2019-10-24 15:35:33 -07:00
Michael Vines	8e5e48dd92	Add get-rpc-url --all flag (#6533 )	2019-10-24 10:44:05 -07:00
Greg Fitzgerald	9232057e95	Rename replicator to archiver (#6464 ) * Rename replicator to archiver * cargo fmt * Fix grammar	2019-10-21 11:29:37 -06:00
Greg Fitzgerald	322fcea6e5	More fullnode to validator renaming (#6337 )	2019-10-11 13:30:52 -06:00
Michael Vines	f9f5bc2eb5	More clippy	2019-10-02 21:21:07 -07:00
Michael Vines	8e888059d8	Use built-in solana-gossip timeout for better error messages (#6189 )	2019-10-01 12:30:11 -07:00
Michael Vines	e3a6c9234a	Entrypoint RPC service discovery now blocks until the entrypoint is actually found (#5756 ) automerge	2019-08-30 16:12:58 -07:00
Michael Vines	22667d64d1	Add various missing cli validators (#5745 ) automerge	2019-08-30 09:27:35 -07:00
Michael Vines	3450b9a44d	Rename solana to solana-core (#5583 )	2019-08-21 10:23:33 -07:00
Michael Vines	08f6a2ea3e	debash: Add `solana-gossip get-rpc-url` command to avoid hard coding (#5513 )	2019-08-13 10:49:48 -07:00
Michael Vines	7981431f09	--entrypoint is a global arg	2019-08-12 16:08:45 -07:00
Michael Vines	4e093525c7	Default to error logs, override with info only for those programs that need it (#5321 ) * Revert "Revert "Default log level to RUST_LOG=solana=info (#5296)" (#5302)" This reverts commit `7796e87814`. * Default to error logs, override with info only for those programs that need it	2019-07-29 10:57:00 -07:00
Michael Vines	7796e87814	Revert "Default log level to RUST_LOG=solana=info (#5296 )" (#5302 ) This reverts commit `c63a38ae57`.	2019-07-27 07:46:45 -07:00
Michael Vines	c63a38ae57	Default log level to RUST_LOG=solana=info (#5296 )	2019-07-26 16:29:16 -07:00
Sagar Dhawan	65adce65fa	Always send pull responses to the origin addr (#4894 )	2019-07-01 16:49:05 -07:00
Sagar Dhawan	a0ffbf50a5	Correctly remove replicator from data plane after its done repairing (#4301 ) * Correctly remove replicator from data plane after its done repairing * Update discover to report nodes and replicators separately * Fix print and condition to be spy	2019-05-16 07:14:58 -07:00
Michael Vines	f3f416b7ba	Rename --network argument to --entrypoint (#4149 )	2019-05-03 15:00:19 -07:00
Michael Vines	7fe3c75c6b	Add a node-specific ip echo service to remove dependency on ifconfig.co (#4137 )	2019-05-03 11:01:35 -07:00
Sagar Dhawan	9add8d0afc	Add alternative to Spy Nodes that can fully participate in Gossip (#4087 ) automerge	2019-04-30 16:42:56 -07:00
Michael Vines	05bcb7f292	Add stop node command to solana-gossip (#3928 )	2019-04-22 14:51:20 -07:00
Michael Vines	149d809e86	Minor cli help cleanup (#3786 )	2019-04-15 13:36:14 -07:00
Michael Vines	0767c0c07f	Add DNS resolution to cli tools	2019-04-14 21:25:46 -07:00
Michael Vines	2277a39dd2	Default solana-gossip log-level to 'info'	2019-04-14 07:07:15 -07:00
Tyera Eulberg	af97ad3d68	Add solana-gossip module	2019-04-01 23:05:25 -06:00


225 Commits